World of Software

Breakthrough Apple study shows advanced reasoning AI doesn’t actually reason at all

News Room
Published 9 June 2025
Last updated: 2025/06/09 at 10:23 AM

With just a few days to go until WWDC 2025, Apple published a new AI study that could mark a turning point for the future of AI as we move closer to AGI.

Apple created tests that reveal reasoning AI models available to the public don’t actually reason. These models produce impressive results in math problems and other tasks because they’ve seen those types of tests during training. They’ve memorized the steps to solve problems or complete various tasks users might give to a chatbot.

But Apple’s own tests showed that these AI models can’t adapt to unfamiliar problems and figure out solutions. Worse, the AI tends to give up if it fails to solve a task. Even when Apple provided the algorithms in the prompts, the chatbots still couldn’t pass the tests.

Apple researchers didn’t use math problems to assess whether top AI models can reason. Instead, they turned to puzzles to test various models’ reasoning abilities.

The tests included puzzles like Tower of Hanoi, Checker Jumping, River Crossing, and Blocks World. Apple evaluated both regular large language models (LLMs) and large reasoning models (LRMs) using these puzzles, adjusting the difficulty levels.

The puzzles Apple gave to AI models. Image source: Apple Inc.
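
To get a feel for how difficulty scales in a puzzle like Tower of Hanoi, here is a minimal Python sketch of the classic recursive solution. This is not Apple's test setup; the function names `hanoi_moves` and `is_valid_solution` and the peg labels A, B, and C are illustrative assumptions. The shortest solution for n disks takes 2^n - 1 moves, so each extra disk roughly doubles the number of steps a solver has to get right.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the shortest list of (disk, from_peg, to_peg) moves for n disks."""
    if n == 0:
        return []
    return (
        hanoi_moves(n - 1, source, spare, target)    # clear the way
        + [(n, source, target)]                      # move the largest disk
        + hanoi_moves(n - 1, spare, target, source)  # restack on top of it
    )


def is_valid_solution(n, moves):
    """Replay the moves and check they legally transfer all disks from A to C."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for disk, src, dst in moves:
        # The disk being moved must be on top of the source peg.
        if not pegs[src] or pegs[src][-1] != disk:
            return False
        # A disk may never be placed on a smaller one.
        if pegs[dst] and pegs[dst][-1] < disk:
            return False
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))


if __name__ == "__main__":
    for n in (3, 7, 10):
        moves = hanoi_moves(n)
        print(n, len(moves), is_valid_solution(n, moves))
```

A checker like `is_valid_solution` is one plausible way to score an answer move by move; how the researchers actually graded model outputs is not described in this article.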

Apple tested LLMs like OpenAI’s GPT-4, Claude 3.7 Sonnet, and DeepSeek V3. For LRMs, it tested OpenAI’s o1 and o3-mini, Gemini, Claude 3.7 Sonnet Thinking, and DeepSeek R1.

The scientists found that LLMs performed better than reasoning models on easy puzzles. LRMs did better at medium difficulty. Once the tasks reached the hard level, all models failed to complete them.

Apple observed that the AI models simply gave up on solving the puzzles at harder levels. Accuracy didn’t just decline gradually; it collapsed outright.

Accuracy comparison between LLMs and LRMs. Image source: Apple Inc.

The study suggests that even the best reasoning AI models don’t actually reason when faced with unfamiliar puzzles. The idea of “reasoning” in this context is misleading since these models aren’t truly thinking.

The Apple researchers added that experiments like theirs could lead to further research aimed at developing better reasoning AI models down the road.

Then again, many of us already suspected that reasoning AI models don’t actually think. AGI, or artificial general intelligence, would be the kind of AI that can figure things out on its own when facing new challenges.

I’ll also point out the obvious “sour grapes” angle here. Apple’s study might be a breakthrough, sure. But it comes at a time when Apple Intelligence isn’t really competitive with ChatGPT, Gemini, and other mainstream AI models. Forget reasoning: Siri can’t even tell you what month it is. I’d choose ChatGPT o3 over Siri any day.

The timing of the study’s release is also questionable. Apple is about to host its annual WWDC 2025, and AI won’t be the main focus. Apple still trails OpenAI, Google, and other AI companies that have released commercial reasoning models. That’s not necessarily a bad thing, especially given that Apple continues to publish studies that showcase its own research and ideas in the field.

Still, Apple is basically saying that reasoning AI models aren’t as capable as people might believe, just days before an event where it won’t have any major AI advancements to announce. That’s fine too. I say this as a longtime iPhone user who still thinks Apple Intelligence has potential to catch up.

Accuracy and token use while trying to solve the puzzles. Image source: Apple Inc.

The study’s findings are important, and I’m sure others will try to verify or challenge them. Some might even use these insights to improve their own reasoning models. Still, it does feel odd to see Apple downplay reasoning AI models right before WWDC.

I’ll also say this: as a ChatGPT o3 user, I’m not giving up on reasoning models even if they can’t truly think. o3 is my current go-to AI, and I like its responses more than the other ChatGPT options. It makes mistakes and hallucinates, but its “reasoning” still feels stronger than what basic LLMs can do.
