World of Software

‘Holy shit’: Gemini 3 is winning the AI race — for now

News Room
Last updated: 2025/11/24 at 10:23 AM
News Room Published 24 November 2025

When an AI model release immediately spawns memes and treatises declaring the rest of the industry cooked, you know you’ve got something worth dissecting.

Google’s Gemini 3 was released Tuesday to widespread fanfare. The company called the model a “new era of intelligence,” integrating it into Google Search on day one for the first time. It’s blown past OpenAI and other competitors’ products on a range of benchmarks and is topping the charts on LMArena, a crowdsourced AI evaluation platform that’s essentially the Billboard Hot 100 of AI model ranking. Within 24 hours of its launch, more than one million users tried Gemini 3 in Google AI Studio and the Gemini API, per Google. “From a day one adoption standpoint, [it’s] the best we’ve seen from any of our model releases,” Google DeepMind’s Logan Kilpatrick, who is product lead for Google’s AI Studio and the Gemini API, told The Verge.

Even OpenAI CEO Sam Altman and xAI CEO Elon Musk publicly congratulated the Gemini team on a job well done. And Salesforce CEO Marc Benioff wrote that after using ChatGPT every day for three years, spending two hours on Gemini 3 changed everything: “Holy shit … I’m not going back. The leap is insane — reasoning, speed, images, video… everything is sharper and faster. It feels like the world just changed, again.”

“This is more than a leaderboard shuffle,” said Wei-Lin Chiang, cofounder and CTO of LMArena. Chiang told The Verge that Gemini 3 Pro holds a “clear lead” in occupational categories including coding, math, and creative writing, and its agentic coding abilities “in many cases now surpass top coding models like Claude 4.5 and GPT-5.1.” It also got the top spot on visual comprehension and was the first model to surpass a ~1500 score on the platform’s text leaderboard.

The new model’s performance, Chiang said, “illustrates that the AI arms race is being shaped by models that can reason more abstractly, generalize more consistently, and deliver dependable results across an increasingly diverse set of real-world evaluations.”

Alex Conway, principal software engineer at DataRobot, told The Verge that one of Gemini 3’s most notable advancements was on a specific reasoning benchmark called ARC-AGI-2. Gemini scored almost twice as high as OpenAI’s GPT-5 Pro while running at one-tenth of the cost per task, he said, which is “really challenging the notion that these models are plateauing.” And on the SimpleQA benchmark — which involves simple questions and answers on a broad range of topics, and requires a lot of niche knowledge — Gemini 3 Pro scored more than twice as high as OpenAI’s GPT-5.1, Conway flagged. “Use case-wise, it’ll be great for a lot more niche topics and diving deep into state-of-the-art research and scientific fields,” he said.

But leaderboards aren’t everything. It’s possible — and in the high-pressure AI world, tempting — to train a model for narrow benchmarks rather than general-purpose success. So to really know how well a system is doing, you have to rely on real-world testing, anecdotal experience, and complex use cases in the wild.

The Verge spoke with professionals across disciplines who use AI every day for work. The consensus: Gemini 3 looks impressive, and it does a great job on a wide breadth of tasks — but when it comes to edge cases and niche aspects of certain industries, many professionals won’t be replacing their current models with it anytime soon.

The majority of people The Verge spoke with plan to continue to use Anthropic’s Claude for their coding needs, despite Gemini 3’s advancements in that space. Some also said that Gemini 3 isn’t optimal on the user interaction front. Tim Dettmers, assistant professor at Carnegie Mellon University and a research scientist at Ai2, said that though it’s a “great model,” it’s a bit raw when it comes to UX, meaning “it doesn’t follow instructions precisely.”

Tulsee Doshi, Google DeepMind’s senior director of product management for Gemini and Gen Media, told The Verge that the company prioritized bringing Gemini 3 to a variety of Google products in a “very real way.” When asked about the instruction-following concerns, she said it’s been helpful to see “where folks are hitting some of the sticking points.”

She also said that since the Pro model is the first release in the Gemini 3 suite, later models will help “round out that concern.”

Joel Hron, CTO of Thomson Reuters, said that the company has developed its own internal benchmarks to rank both its internal models and public ones on the areas most relevant to its work, such as comparing two documents that each run to several hundred pages, interpreting a long document, understanding legal contracts, and reasoning in the legal and tax spaces. He said that so far, Gemini 3 has performed strongly across all of them and is “a significant jump up from where Gemini 2.5 was.” It also outperforms several of Anthropic’s and OpenAI’s models right now in some of those areas.

Louis Blankemeier, cofounder and CEO of Cognita, a radiology AI startup, said that in terms of “pure numbers” Gemini 3 is “super exciting.” But, he said, “we still need some time to figure out what the real-world utility of this model is.” For more general domains, Blankemeier said, Gemini 3 is a star, but when he played around with it for radiology, it struggled with correctly identifying subtle rib fractures on chest X-rays, as well as uncommon or rare conditions. He calls radiology akin to self-driving cars in many ways, with a lot of edge cases — so a newer, more powerful model may still not be as effective as an older one that’s been refined and trained on custom data over time. “The real world is just so much more difficult,” he said.

Similarly, Matt Hoffman, head of AI at Longeye, a company providing AI tools for law enforcement investigations, sees promise in the Gemini 3 Pro-powered Nano Banana Pro image generator. Image generators allow Longeye to create convincing synthetic datasets for testing, letting it keep real, sensitive investigation data secure. But although the benchmarks are impressive, they may not map to the company’s actual use cases. “I’m not confident Longeye could swap out a model we’re using in production for Gemini 3 and see immediate improvements,” he said.

Other companies also say they’re excited about Gemini — but not necessarily using it to replace everything else. Built, a construction lending startup, currently uses a mix of foundational models from Google, Anthropic, OpenAI, and others to analyze construction draw requests — a package of documents often sent to a construction lender, like invoices and proof of work done, requesting that funds be paid. This requires multimodal analysis of text and images, plus a large context window for the main agent delegating tasks to the others, VP of engineering Thomas Schlegel told The Verge. That’s part of what Google promises with Gemini 3, so the company is currently exploring switching it out for 2.5.

“In the past we’ve found Gemini to be the best at all-purpose tasks, and 3 looks to be a big step forward along those same lines,” Schlegel said. “It’s everything we love about Gemini on steroids.” But he doesn’t yet think it will replace all the other models, including Claude for coding tasks and OpenAI products for business reasoning.

For Tanmai Gopal, cofounder and CEO of AI agent platform PromptQL, the stir Gemini 3 has caused is valid, but “it’s definitely not the end of anything” for Google’s competitors. AI models are becoming better and cheaper, and since they’re on such quick release cycles, “one is always ahead of the pack for a period of time.” (For instance, the day after Gemini 3 came out, OpenAI released GPT-5.1-Codex-Max, an update to a week-old model, ostensibly to challenge Gemini 3 on a few coding benchmarks.)

Gopal said PromptQL is still working on internal evaluations to decide how, if at all, the team’s model choices will change, but “initial results aren’t necessarily showing something drastically better” than their current lineup. He said his current preference is Claude for code generation, ChatGPT for web search, and GPT-5 Pro for “deep brainstorming,” but he may incorporate Gemini 3 as a default model, since it’s “probably best-in-class for consumer tasks across creative, text, [and] image.”

And like virtually every model, Gemini 3 has had moments of what I’ll dub “robotic hand syndrome” — when an AI system does something complex with flying colors but gets gobsmacked by the simplest query, akin to the robotic hands of yesteryear having trouble gripping a soda can. Famed researcher Andrej Karpathy, who was a founding member of OpenAI and former director of AI at Tesla, wrote on X after testing Gemini 3 that he “had a positive early impression yesterday across personality, writing, vibe coding, humor, etc., very solid daily driver potential, clearly a tier 1 LLM,” but he noted that the model refused to believe him when he said it was 2025 and later said it had forgotten to turn on Google Search. (He ascertained that in early testing, he may have been given a model with a stale system prompt.)

In The Verge’s own experience testing Gemini 3, we found it “delivers reasonably well — with caveats.” It likely won’t stay on top forever, but it’s an unmistakable step up for the company.

“You’re sort of in this leapfrog game from model to model, month to month, when a new one drops,” Hron said. “But what stuck to me about Google’s release is it makes substantial improvements across many dimensions of models — so it’s not like it just got better at coding or it just got better at reasoning … It really, across the board, got a good bit better.”

Hayden Field