A Move Aimed at Closing the Gap With Google's Gemini 3

In the race to lead the development of artificial intelligence, each launch now triggers the next. On November 12, GPT-5.1 arrived, an update aimed at polishing the experience and keeping users satisfied. Just a few days later, on November 18, Google responded with Gemini 3, an evolution of its flagship model that made a very good impression on those who began to try it.

Following that launch, rumors began to circulate: the startup led by Sam Altman had reportedly declared a “code red” on seeing its direct rival pull ahead. GPT-5.2 appears to be the first result of that internal push. Less than a month has passed since the previous update of its flagship model, and GPT-5.2 is already here. The promise is to fix some known problems, reduce latency and improve reasoning.

An evolution within the 5 series. GPT-5.2 appears as a version designed to boost knowledge work, with advances in coding, vision, document analysis and multi-step projects. OpenAI incorporates it as the direct evolution of GPT-5.1, not as a generational leap. According to the company, the update improves the management of long contexts, reduces errors and increases the ability to coordinate tools.

More clearly separated tiers. The three usual variants are now somewhat more distinct, not because of new features but because of how each one integrates the improvements announced by OpenAI. Thinking absorbs much of the progress in reasoning, handling large documents, and coordinating tools. Pro raises the bar in specialized tasks, especially in code and technical calculations. Instant, for its part, benefits from more stable explanations and a reduction in errors. The result is a clearer separation between everyday tasks, complex jobs and expert needs.

A visible improvement in multiple evaluations. OpenAI presents figures that show GPT-5.2 ahead of GPT-5.1 in very different areas, from scientific reasoning to programming and knowledge tasks. In GDPval, the evaluation that measures well-specified tasks across 44 occupations, the model wins or ties against human professionals 70.9% of the time. In GPQA Diamond it rises to 92.4% and in AIME 2025 it achieves 100%. The trend is repeated in technical tests such as FrontierMath or ARC-AGI, where performance also increases compared to the previous version.


The improvements also show up when moving from benchmark figures to day-to-day tasks. In internal evaluations based on work typical of financial analysts, such as three-statement modeling or leveraged buyout simulations, Thinking raises its average score from 59.1% to 68.4%. The company also promises advances in generating spreadsheets and presentations with a clearer structure. In addition, according to OpenAI, companies such as Notion, Box, Shopify or Harvey have observed improvements in long-range reasoning and in tool use in their own workflows. If these results hold up in real environments, they could reduce manual work in processes that require precision and consistency.

A more stable environment for developers. According to OpenAI, GPT-5.2 Thinking achieves higher performance in demanding software tests, especially those that evaluate the ability to apply complete and consistent changes in real projects. The company indicates that the model coordinates sequences of steps better, something that is reflected in internal evaluations and in feedback from platforms such as Windsurf or Charlie Labs.

Fewer errors in sight. OpenAI claims that GPT-5.2 Thinking reduces the frequency of responses with errors by around 30% relative to GPT-5.1. The company attributes this improvement to more stable reasoning and a greater ability to detect errors before generating the final response. It also points to advances in handling sensitive situations, such as conversations linked to emotional distress or mental health. Although the company acknowledges that the model is still imperfect, it maintains that these adjustments contribute to a more reliable experience in everyday use.


Where you can use GPT-5.2 today. OpenAI indicates that GPT-5.2 is beginning to roll out in ChatGPT for paid plans, including Plus, Pro, Go, Business, and Enterprise. In the API, GPT-5.2 Thinking is available as gpt-5.2 and the Instant version appears as gpt-5.2-chat-latest. The company has also promised to keep GPT-5.1 available in ChatGPT for three months before removing it from paid plans. In terms of pricing, GPT-5.2 costs $1.75 per million input tokens and $14 per million output tokens, more expensive than GPT-5.1, although OpenAI maintains that its greater efficiency reduces the final cost in demanding tasks.
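To put those prices in perspective, here is a minimal Python sketch that estimates what a single request would cost at the quoted rates. The function name `estimate_cost` and the example token counts are hypothetical and chosen only for illustration. The calculation assumes plain list prices and ignores any discounts or other billing rules that may apply.

```python
from decimal import Decimal

# Prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = Decimal("1.75")
OUTPUT_PRICE_PER_MILLION = Decimal("14")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD of one request at list prices.

    Hypothetical helper: it only multiplies token counts by the quoted
    per-million rates and does not model any other billing detail.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative request: 20,000 input tokens and 2,000 output tokens.
    print(f"Estimated cost: ${estimate_cost(20_000, 2_000)}")
```

With these illustrative numbers, the input side costs $0.035 and the output side $0.028, so the request comes to about $0.063. Output tokens are eight times more expensive than input tokens at these rates, so long responses tend to dominate the bill.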

Images | OpenAI
