SAN FRANCISCO – OpenAI on December 20 unveiled a new artificial intelligence system, OpenAI o3, designed to “reason” through problems in math, science and computer programming.
The company said the system, which it currently shares only with safety and security testers, outperformed the industry’s leading AI technologies on standardized benchmark tests that assess skills in math, science, coding and logic.
The new system is the successor to o1, the reasoning system that the company introduced in 2024. OpenAI o3 was more than 20 percent more accurate than o1 on a range of common programming tasks, the company said, and it even outperformed the company’s chief scientist, Jakub Pachocki, on a competitive programming test.
OpenAI said it plans to roll out the technology to individuals and businesses in early 2025.
“This model is incredible in terms of programming,” said Sam Altman, CEO of OpenAI, during an online presentation to unveil the new system. He added that at least one OpenAI programmer could still beat the system in this test.
The new technology is part of a broader effort to build AI systems that can perform complex tasks. This week, Google unveiled a similar technology, called Gemini 2.0 Flash Thinking Experimental, and shared it with a small number of testers.
These two companies and others strive to build systems that can solve a problem carefully and logically through a series of steps, with each step building on the previous one. These technologies can be useful for computer programmers who use AI systems to write code or for students who seek help from computerized tutors in areas such as math and science.
With the debut of the ChatGPT chatbot in late 2022, OpenAI showed that machines could handle requests more like humans, answer questions, write term papers and generate computer code. But the responses were sometimes flawed.
ChatGPT learned its skills by analyzing vast amounts of text from the Internet, including news articles, books, computer programs and chat logs. By pinpointing patterns in that text, it learned to generate text itself.
Because the Internet is filled with untruthful information, the technology learned to repeat the same untruths. Sometimes it made things up – a phenomenon scientists call “hallucination.”
OpenAI built its new system using what’s called reinforcement learning. Through this process, a system can learn behavior through extensive trial and error. For example, by working on different math problems, it can learn which techniques lead to the correct answer and which do not. If it repeats this process with a very large number of problems, it can identify patterns.
Although systems like o3 are designed for reasoning, they are based on the same core technology as the original ChatGPT. That means they can still do things wrong or hallucinate.
The system is designed to “think” through problems. It tries to break a problem down into pieces and look for ways to solve it, which can require far more computing power than regular chatbots need. That can also be expensive.
In December, OpenAI began selling OpenAI o1 to individuals and businesses. One service, aimed at professionals, cost $200 per month.
The New York Times sued OpenAI and Microsoft in December 2023, accusing them of copyright infringement of news content related to AI systems. The companies have denied the claims. NY TIMES