Microsoft 365 Copilot and the end of the single-model era in enterprise AI

News Room
Published 9 April 2026 · Last updated: 2026/04/09 at 3:10 PM
Steve Gustavson, Microsoft’s corporate vice president for design and research. (Microsoft Photo)

[Editor’s Note: Agents of Transformation is an independent GeekWire series, underwritten by Accenture, exploring the adoption and impact of AI and agents. See coverage of our related event.]

Using an AI model still comes with an unspoken asterisk: Verify before you act. Fact-check it. Google it. Ask a colleague. The burden of accuracy has always landed on the human at the end of the day. But Microsoft thinks it has a way to shift that burden — have two AIs keep tabs on each other.

In an era when workforce tasks are increasingly being handled by AI agents, this multi-model strategy now reaches into something human workers assumed was theirs alone: the judgment call. The human-in-the-loop had long been the one non-negotiable in AI workflows. Microsoft’s approach doesn’t eliminate it, but it does raise the question of how much of that role we’re willing to hand over.

‘Two heads are better than one’

Microsoft isn’t alone in this bet. Amazon Web Services, Google, and others are building platforms that give enterprises access to multiple models through a single interface. 

Amazon Bedrock offers access to foundation models from multiple providers, while Google’s Gemini Enterprise presents a single front door for workplace AI. Microsoft’s distinction is that it’s embedding multi-model review directly into a productivity tool used by millions of workers.

We saw the first implementation of this plan last week with new upgrades to Microsoft 365 Copilot. Its Researcher agent can now use OpenAI’s GPT to draft a response, then have Anthropic’s Claude review it for accuracy, completeness, and citation quality before finalizing it. 

“We intentionally want a diversity of opinions,” Steve Gustavson, Microsoft’s corporate vice president for design and research, told GeekWire in an interview. “Two heads are better than one when they come together.”

The stakes are not trivial. Research has already shown that AI users tend to outsource critical thinking to models they perceive as authoritative. If we’re already surrendering judgment to a single model, could a second model that pushes back on the first be the check that’s been missing?

It’s a question Microsoft has been wrestling with in designing Critique and Council, the two new features within its Researcher agent.

“Our research consistently shows that workers continue to crave both deeper trust in AI and quality content,” Gustavson said. “People are either over-trusting AI — accepting claims they shouldn’t — or under-trusting it and not getting the full value. Both are design and technical opportunities.”

Take Microsoft’s Critique feature, for example. Gustavson said Microsoft designed it around a deliberate handoff: GPT leads the generation, and Claude steps in as the reviewer. 

“The separation matters because evaluation is a different cognitive mode than generation,” he said. “When one model does both, you get the same blind spots twice. When a second model’s job is to validate the first, you get something structurally different.”

This creates a “powerful feedback loop that delivers higher-quality results across factual accuracy, analytical breadth, and presentation,” Gaurav Anand, Microsoft’s corporate vice president for engineering, wrote in a technical blog post about M365’s Critique feature.
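For readers who build agents, the handoff Gustavson describes can be sketched in a few lines of Python. This is an illustration only, not Microsoft's implementation: the names `draft_with_review`, `Review`, `stub_generate`, and `stub_review` are hypothetical, and the two stub functions stand in for calls to a generating model and a reviewing model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    approved: bool      # True when the reviewer finds nothing to fix
    issues: List[str]   # reviewer feedback passed back to the generator


# Hypothetical signatures: a generator takes (prompt, feedback) and returns a
# draft; a reviewer takes (prompt, draft) and returns a Review.
Generator = Callable[[str, List[str]], str]
Reviewer = Callable[[str, str], Review]


def draft_with_review(prompt: str, generate: Generator, review: Reviewer,
                      max_rounds: int = 2) -> str:
    """One model drafts, a second model critiques, the first revises."""
    draft = generate(prompt, [])
    for _ in range(max_rounds):
        result = review(prompt, draft)
        if result.approved:
            return draft
        # Feed the reviewer's objections back into the next draft.
        draft = generate(prompt, result.issues)
    # Rounds exhausted: the last revision is returned without a final review.
    return draft


# Stand-ins for real model calls (assumptions for this sketch).
def stub_generate(prompt: str, feedback: List[str]) -> str:
    if "missing citation" in feedback:
        return "Revenue grew 12% [source: Q3 report]."
    return "Revenue grew 12%."


def stub_review(prompt: str, draft: str) -> Review:
    if "[source:" in draft:
        return Review(approved=True, issues=[])
    return Review(approved=False, issues=["missing citation"])


final = draft_with_review("Summarize Q3 revenue", stub_generate, stub_review)
```

The point of the separation is that the reviewer never sees its own draft, so it is checking someone else's work rather than re-reading its own.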

Multi-model isn’t just a proof of concept — it’s live, and it’s already the default experience inside Researcher. But Gustavson is quick to point out that most workers won’t care which models are running under the hood. The models, in his view, should be invisible.

“The average user wants phenomenal outputs. They want to be able to trust them,” he said. “Do they need to know it’s 5.2 versus whatever? I don’t think so.” 

Gustavson disputes that this is a case of the “blind leading the blind,” stressing that tuning the models is how to avoid hallucinations. With Researcher, “Claude has proven to be a fantastic synthesizer and sort of check on what the GPT models might be doing.” 

However, Gustavson said Microsoft is continuously comparing the performance of single-model and two-model setups, and is also testing “an LLM judge in between the two” to understand the trade-offs.

Gustavson said Microsoft plans to move away from promoting specific model names altogether, shifting the focus to what a worker is trying to accomplish. For example, he said, workers could specify that they’re in finance, and Copilot would route work to whichever models best handle Excel, data synthesis, and analysis — no model-picking required.
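A minimal sketch of that kind of task-based routing might look like the following. The routing table, the `route` function, and the placeholder model names are all hypothetical; the article does not describe how Copilot actually selects models.

```python
from typing import Dict

# Hypothetical routing table: worker role -> task type -> model identifier.
# The model names are placeholders, not real products.
ROUTES: Dict[str, Dict[str, str]] = {
    "finance": {
        "spreadsheet": "model-a",
        "synthesis": "model-b",
        "analysis": "model-a",
    },
}

DEFAULT_MODEL = "model-default"


def route(role: str, task: str) -> str:
    """Pick a model from the worker's role and task; the user never names one."""
    return ROUTES.get(role, {}).get(task, DEFAULT_MODEL)


chosen = route("finance", "synthesis")
```

The worker supplies only the role and the task, and anything the table does not cover falls back to a default model.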

The enterprise AI pendulum

For Microsoft, multi-model is less of a feature than the inevitable direction of enterprise AI. Gustavson calls it a natural progression, noting that Copilot started out with a single model.

Since then, he said, the industry has been swinging between what models can do, what the product experience should be, and where the competitive moat exists. 

“I think this is just a natural evolution,” he said. “Two models are better than one.”

With models leapfrogging each other every few months, Microsoft isn’t betting on any single one, but rather trying to build something that outlasts them all.

As organizations move from experimenting with AI to depending on it for consequential decisions, the single-model approach starts to show its limits. The question may be less whether enterprises should adopt multi-model than whether they’re ready to accept a system where checks are automated, models are invisible, and AI reviews AI before a human ever sees the output.

Beyond the initial integration into the Researcher agent, Gustavson said Microsoft plans to extend the multi-model approach to its other AI tools. He hopes the approach becomes standard across the industry. In his view, building multi-model review into agentic workflows is both good governance and good design.

For those building agentic experiences, Gustavson’s advice is simple: treat agents like any process with meaningful consequences. The key question: “Who checks the work?”
