Google announced its widely anticipated Gemini 3 model Tuesday. By many key metrics, it appears to be more capable than the other big generative AI models on the market.
In a show of confidence in the performance (and safety) of the new model, Google is making one variant of Gemini—Gemini 3 Pro—available to everyone via the Gemini app starting now. It’s also making the same model a part of its core search service for subscribers.
The new model topped the much-cited LMArena leaderboard, a crowdsourced ranking of top models based on head-to-head responses to identical prompts. On the super-difficult Humanity’s Last Exam benchmark, which measures reasoning and knowledge, Gemini 3 Pro scored 37.4% compared to GPT-5 Pro’s 31.6%. Gemini 3 also topped a range of other benchmarks measuring everything from reasoning to academic knowledge to math to tool use and agent functions.
Gemini has been a multimodal model from the start, meaning that it can understand and reason about not just language, but images, audio, video, and code—all at the same time. This capability has been steadily improving since the first Gemini, and Gemini 3 reached state-of-the-art performance on the MMMU-Pro benchmark, which measures how well a model handles college-level and professional-level reasoning across text and images. It also topped the Video-MMMU benchmark, which measures the ability to reason over details of video footage. For example, the Gemini model might ingest a number of YouTube videos, then create a set of flashcards based on what it learned.
Gemini also posted strong scores for its ability to write computer code, which made this a fitting moment for the company to launch a new Cursor-like coding agent called Antigravity. Software development has proven to be among the first business functions in which generative AI has had a measurably positive impact.
Benchmarks are telling, but as the response to OpenAI’s GPT-5.1 showed, the “feel” or “personality” of a model matters to users (many users thought GPT-5 was a dramatic personality downgrade from GPT-4o). Google DeepMind CEO Demis Hassabis seemed to acknowledge this in a tweet Tuesday: “(B)eyond the benchmarks it’s been by far my favorite model to use for its style and depth, and what it can do to help with everyday tasks.” Of course, users will have their own say about Gemini 3’s communication style and how well it adapts to their preferences and work habits.
With the release of Google’s third-generation generative AI model, it’s a good time to look at the wider context of the race to build the dominant AI models of the 21st century. The contest, remember, is only a few years old. So far, OpenAI’s models have spent the most time atop the benchmark rankings and, on the strength of ChatGPT, have garnered more attention than any other player in the emerging AI industry.
History on its side?
From the start, Google has enjoyed some distinct advantages. It’s been investing in AI talent and research for decades, starting long before OpenAI was founded in 2015. It began developing machine learning techniques for understanding search intent, ranking pages, and placing ads as far back as 2001. It bought the London-based AI research lab DeepMind back in 2014, and DeepMind has been responsible for some of Google’s biggest AI accomplishments (AlphaGo, AlphaFold, the Gemini models).
The big research breakthroughs that enabled the current wave of generative AI models took place at Google. In 2017, Google researchers invented the transformer architecture, which allowed language models to learn far more from their training data than earlier designs. The following year Google used the transformer to build its BERT language model; the same architecture underpins the GPT models that power ChatGPT. In fact, the search giant developed an AI chatbot well before OpenAI released ChatGPT, but was conflicted about releasing it or infusing it into its other products because of legal and business-model concerns.
All the data
Google has access to more and better-quality training data than any other AI company. It’s been indexing most of the information on the web since 1998. It also owns huge amounts of information such as local business data, mapping data, and customer reviews, which can be used to train AI models or augment their output (within search results, for example).
Generative models are just now gaining the ability to learn about the world from video footage the way they already learn from large amounts of text. With YouTube, Google has access to mountains of such footage, and its AI models could gain a growing intelligence advantage by training on it.
As AI begins to manage more and more of our personal and work tasks, Google’s advantages in experience, talent, data, and other resources may help sustain Gemini’s state-of-the-art status and overall functionality in the years to come.
High stakes
This is about more than which company can sell the most API access to its models or the most chatbot subscriptions. Models like Gemini, Claude, and GPT-5 may eventually become smarter, perhaps far smarter, than humans at almost any task. The company whose models reach that level, often called “artificial general intelligence” (AGI), may dominate the marketplace for consumer and business AI in the same way Google has dominated search in the first decades of this century. With tech companies already spending hundreds of billions of dollars to build the infrastructure for their AI businesses, the pressure is mounting to push harder and faster on new generations of AI models.
