By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
World of SoftwareWorld of SoftwareWorld of Software
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Search
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
Reading: Apple’s Illusion of Thinking Paper Explores Limits of Large Reasoning Models
Share
Sign In
Notification Show More
Font ResizerAa
World of SoftwareWorld of Software
Font ResizerAa
  • Software
  • Mobile
  • Computing
  • Gadget
  • Gaming
  • Videos
Search
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Have an existing account? Sign In
Follow US
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
World of Software > News > Apple’s Illusion of Thinking Paper Explores Limits of Large Reasoning Models
News

Apple’s Illusion of Thinking Paper Explores Limits of Large Reasoning Models

News Room
Last updated: 2025/07/01 at 9:40 AM
News Room Published 1 July 2025
Share
SHARE

Apple Machine Learning Research published a paper titled The Illusion of Thinking, which investigates the abilities of Large Reasoning Models (LRMs) on a set of puzzles. As the complexity of the puzzles increases, the researchers found that LRMs encounter a “collapse” threshold where the models reduce their reasoning effort, indicating a limit to the models’ scalability.

For their experiments, Apple researchers chose four puzzle problems, including Tower of Hanoi, and a variety of LRMs and standard LLMs, including o3-mini and DeepSeek-R1. Each puzzle’s complexity could be varied; for example, the Tower of Hanoi puzzle can have a variable number of disks. They found that as complexity increased, model behavior went through three regimes: in the first, with simple problems, both reasoning and non-reasoning models performed similarly well. In the second, medium complexity regime, the reasoning models with their Chain-of-Thought (CoT) inference performed better than LLMs. But in the high complexity regime, both groups’ performance “collapsed to zero.” According to Apple, 

In this study, we probe the reasoning mechanisms of frontier LRMs through the lens of problem complexity….Our findings reveal fundamental limitations in current models: despite sophisticated self-reflection mechanisms, these models fail to develop generalizable reasoning capabilities beyond certain complexity thresholds….These insights challenge prevailing assumptions about LRM capabilities and suggest that current approaches may be encountering fundamental barriers to generalizable reasoning.

LRMs such as o3 and DeepSeek-R1 are LLMs that have been fine-tuned to generate step-by-step instructions for itself before producing a response to users; in essence, the models “think out loud” to produce better answers. This allows the models to outperform their “standard” LLM counterparts on many tasks, especially coding, mathematics, and science benchmarks.

As part of their experiments, the Apple team analyzed these reasoning traces generated by the models. They noted that for simpler problems, the models would often “overthink:” the correct solution would appear early in the trace, but the models would continue to explore incorrect ideas. In medium complexity problems, however, the models would explore incorrect solutions before finding the correct one.

Apple’s paper sparked a wide debate in the AI community. Gary Marcus, a cognitive scientist and critic of the current state of AI, wrote about the research, saying:

What the Apple paper shows, most fundamentally, regardless of how you define [Artificial General Intelligence (AGI)], is that LLMs are no substitute for good well-specified conventional algorithms. (They also can’t play chess as well as conventional algorithms, can’t fold proteins like special-purpose neurosymbolic hybrids, can’t run databases as well as conventional databases, etc.)

Open source developer and AI commentator Simon Willison pointed out:

I’m not interested in whether or not LLMs are the “road to AGI”. I continue to care only about whether they have useful applications today, once you’ve understood their limitations. Reasoning LLMs are a relatively new and interesting twist on the genre. They are demonstrably able to solve a whole bunch of problems that previous LLMs were unable to handle, hence why we’ve seen a rush of new models from OpenAI and Anthropic and Gemini and DeepSeek and Qwen and Mistral….They’re already useful to me today, whether or not they can reliably solve the Tower of Hanoi….

Apple acknowledges several limitations of their research, noting in particular that their experiments mostly relied on “black box” API calls, leaving them unable to examine the inner state of the models. They also agree that the use of puzzles means that their conclusions may not generalize to all reasoning domains.

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Email Print
Share
What do you think?
Love0
Sad0
Happy0
Sleepy0
Angry0
Dead0
Wink0
Previous Article Stablecoins could be key in PAPSS’ new blockchain platform
Next Article Xpeng Motors to invest $413 million in flying cars this year: CEO · TechNode
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

248.1k Like
69.1k Follow
134k Pin
54.3k Follow

Latest News

DORA Regulation Explained – Plus a Free Compliance Checklist | HackerNoon
Computing
1MinAI is an all-in-one AI tool—get it for life for just $40
News
Roborock V20 and G20S, robotic vacuums for smart home cleaning · TechNode
Computing
Threads now has DMs – here’s how to send direct messages
News

You Might also Like

News

1MinAI is an all-in-one AI tool—get it for life for just $40

2 Min Read
News

Threads now has DMs – here’s how to send direct messages

3 Min Read
News

I test smart air conditioners for a living — here’s my top 3 picks to make it through a heat wave

6 Min Read
News

Google Photos is getting a brand-new Photo view, but Android users will have to wait

3 Min Read
//

World of Software is your one-stop website for the latest tech news and updates, follow us now to get the news that matters to you.

Quick Link

  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact

Topics

  • Computing
  • Software
  • Press Release
  • Trending

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

World of SoftwareWorld of Software
Follow US
Copyright © All Rights Reserved. World of Software.
Welcome Back!

Sign in to your account

Lost your password?