Microsoft’s AI division just unveiled two new systems: MAI-Voice-1 and MAI-1-preview. The star of the show is MAI-Voice-1, a speech model that can generate a full minute of audio in under a second – and it only needs a single GPU to do it. Meanwhile, MAI-1-preview is described as “a glimpse of future offerings inside Copilot.”
I gave MAI-Voice-1 a spin, and I have to admit, it is impressive. The audio it produces is almost impossible to tell apart from a real person talking. That both blows my mind and annoys me at the same time, because once again, AI tools are being positioned as replacements for human creativity. But that’s a rant for another day.
Microsoft isn’t wasting time putting its new model to work. MAI-Voice-1 already powers Copilot Daily, where an AI host reads out the latest news, and it is also being used to create podcast-style conversations that explain complex topics.
If you want to try it yourself, head over to Copilot Labs. You can type in what you want the AI to say, tweak the voice, and even change the speaking style. On top of that, Microsoft also launched MAI-1-preview, which it trained on roughly 15,000 Nvidia H100 GPUs. This one isn’t about speech – it’s built to follow instructions and give helpful text-based answers to everyday questions.
The plan is to integrate MAI-1-preview into Copilot for certain text-related tasks. Until now, Copilot has relied mostly on OpenAI’s large language models. Microsoft has also started putting MAI-1-preview through its paces on the public benchmarking site LMArena.
We have big ambitions for where we go next. Not only will we pursue further advances here, but we believe that orchestrating a range of specialized models serving different user intents and use cases will unlock immense value. There will be a lot more to come from this team on both fronts in the near future. We’re excited by the work ahead as we aim to deliver leading models and put them into the hands of people globally.
– Microsoft, August 2025
Of course, Microsoft isn’t the only one making moves. OpenAI, which still works closely with Microsoft, recently launched its most advanced model yet: GPT-5. It’s designed as a unified system that knows when to keep answers short and when to go all in with detailed, expert-level responses.
Meanwhile, Google is pushing forward on the visual side of things. Its DeepMind team released a brand-new image editing model under the odd codename “nano banana,” now officially rolled out as Gemini 2.5 Flash Image – its most powerful image-generation model to date. The big selling point? It keeps a subject’s appearance consistent across edits, something users have been asking for all along.
So yeah, the AI race isn’t slowing down anytime soon. Microsoft, OpenAI, Google – they’re all dropping new models and capabilities one after another. Expect the pace to only get faster from here.
“Iconic Phones” is coming this Fall!
Good news, everyone! Over the past year we’ve been working on an exciting passion project, and we’re thrilled to announce it will be ready for release in just a few short months.