In September, OpenAI unveiled a new version of ChatGPT designed to reason through tasks involving math, science and computer programming. Unlike previous versions of the chatbot, this new technology could spend time “thinking” through complex problems before settling on an answer.
Soon, the company said its new reasoning technology had outperformed the industry’s leading systems on a series of tests that track the progress of artificial intelligence.
Now other companies, like Google, Anthropic and China’s DeepSeek, offer similar technologies.
But can A.I. actually reason like a human? What does it mean for a computer to think? Are these systems really approaching true intelligence?
Here is a guide.
What does it mean when an A.I. system reasons?
Reasoning just means that the chatbot spends some additional time working on a problem.
“Reasoning is when the system does extra work after the question is asked,” said Dan Klein, a professor of computer science at the University of California, Berkeley, and chief technology officer of Scaled Cognition, an A.I. start-up.
It may break a problem into individual steps or try to solve it through trial and error.
The original ChatGPT answered questions immediately. The new reasoning systems can work through a problem for several seconds — or even minutes — before answering.
Can you be more specific?
In some cases, a reasoning system will refine its approach to a question, repeatedly trying to improve the method it has chosen. Other times, it may try several different ways of approaching a problem before settling on one of them. Or it may go back and check some work it did a few seconds before, just to see if it was correct.
Basically, the system tries whatever it can to answer your question.
This is kind of like a grade school student who is struggling to find a way to solve a math problem and scribbles several different options on a sheet of paper.
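For readers who want to see that pattern in code, here is a deliberately simple sketch in Python. Everything in it is invented for illustration — the equation, the check function, the range of guesses — and it shows only the general "propose an answer, then verify it" loop, not how any commercial chatbot actually works.

```python
# A toy "propose, check, keep what survives" loop -- the same pattern as a
# student scribbling several attempts and verifying each one. Purely
# illustrative; real reasoning systems work over text, not a hand-written check.

def check(candidate: int) -> bool:
    # "Checking the work": does the candidate really solve x^2 - 5x + 6 = 0?
    return candidate ** 2 - 5 * candidate + 6 == 0

def reason(limit: int = 20) -> list[int]:
    verified = []
    for candidate in range(-limit, limit + 1):
        # Trial and error: propose an answer, then go back and check it.
        if check(candidate):
            verified.append(candidate)
    return verified

print(reason())  # [2, 3] -- only answers that pass the check are kept
```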
What sort of questions require an A.I. system to reason?
It can potentially reason about anything. But reasoning is most effective when you ask questions involving math, science and computer programming.
How is a reasoning chatbot different from earlier chatbots?
You could ask earlier chatbots to show you how they had reached a particular answer or to check their own work. Because the original ChatGPT had learned from text on the internet, where people showed how they had gotten to an answer or checked their own work, it could do this kind of self-reflection, too.
But a reasoning system goes further. It can do these kinds of things without being asked. And it can do them in more extensive and complex ways.
Companies call it a reasoning system because it feels as if it operates more like a person thinking through a hard problem.
Why is A.I. reasoning important now?
Companies like OpenAI believe this is the best way to improve their chatbots.
For years, these companies relied on a simple concept: The more internet data they pumped into their chatbots, the better those systems performed.
But in 2024, they used up almost all of the text on the internet.
That meant they needed a new way of improving their chatbots. So they started building reasoning systems.
How do you build a reasoning system?
Last year, companies like OpenAI began to lean heavily on a technique called reinforcement learning.
Through this process — which can extend over months — an A.I. system can learn behavior through extensive trial and error. By working through thousands of math problems, for instance, it can learn which methods lead to the right answer and which do not.
Researchers have designed complex feedback mechanisms that show the system when it has done something right and when it has done something wrong.
“It is a little like training a dog,” said Jerry Tworek, an OpenAI researcher. “If the system does well, you give it a cookie. If it doesn’t do well, you say, ‘Bad dog.’”
(The New York Times sued OpenAI and its partner, Microsoft, in December for copyright infringement of news content related to A.I. systems.)
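In code, that "cookie" is nothing more than a number. The sketch below, written in Python, is a deliberate caricature with a made-up two-strategy "policy"; real systems adjust the billions of weights inside a neural network rather than a small dictionary. But the feedback loop has the same shape: behavior that earns a reward becomes more likely to be tried again.

```python
import random

# A toy caricature of reinforcement learning on arithmetic problems.
# Purely illustrative; the strategies, weights and reward rule are invented.

strategies = {"add": 1.0, "multiply": 1.0}   # the system starts with no preference

def attempt(strategy: str, a: int, b: int) -> int:
    return a + b if strategy == "add" else a * b

for _ in range(1000):
    a, b = random.randint(1, 9), random.randint(1, 9)
    # Pick a strategy in proportion to how well it has worked so far.
    strategy = random.choices(list(strategies), weights=list(strategies.values()))[0]
    # The reward: a correct answer to "a + b" makes that strategy more likely next time.
    if attempt(strategy, a, b) == a + b:
        strategies[strategy] += 1.0

print(strategies)  # "add" ends up with far more weight than "multiply"
```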
Does reinforcement learning work?
It works pretty well in certain areas, like math, science and computer programming. These are areas where companies can clearly define what counts as good behavior and what counts as bad. Math problems have definitive answers.
Reinforcement learning doesn’t work as well in areas like creative writing, philosophy and ethics, where the distinction between good and bad is harder to pin down. Still, researchers say the process can generally improve an A.I. system’s performance, even when it answers questions outside math and science.
“It gradually learns what patterns of reasoning lead it in the right direction and which don’t,” said Jared Kaplan, chief science officer at Anthropic.
Are reinforcement learning and reasoning systems the same thing?
No. Reinforcement learning is the method that companies use to build reasoning systems. It is the training stage that ultimately allows chatbots to reason.
Do these reasoning systems still make mistakes?
Absolutely. Everything a chatbot does is based on probabilities. It chooses a path that is most like the data it learned from — whether that data came from the internet or was generated through reinforcement learning. Sometimes it chooses an option that is wrong or does not make sense.
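As a rough illustration of how a probabilistic choice can go wrong, the Python sketch below draws a "next word" from a made-up probability table. The numbers are invented; the point is only that even a heavily favored correct answer is not guaranteed to be chosen every single time.

```python
import random

# A toy illustration of probabilistic choice (the probabilities are made up).
# A language model assigns a probability to each possible continuation and
# then samples one; most of the time it picks a sensible answer, but
# occasionally it picks a poor one, which is one source of mistakes.

next_word_probs = {"4": 0.90, "5": 0.07, "banana": 0.03}  # after "2 + 2 ="

counts = {word: 0 for word in next_word_probs}
for _ in range(10_000):
    word = random.choices(list(next_word_probs),
                          weights=list(next_word_probs.values()))[0]
    counts[word] += 1

print(counts)  # roughly 9,000 "4"s, but a few hundred wrong answers slip through
```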
Is this a path to a machine that matches human intelligence?
A.I. experts are split on this question. These methods are still relatively new, and researchers are still trying to understand their limits. In the A.I. field, new methods often progress very quickly at first, before slowing down.