AI research summaries ‘exaggerate findings’, study warns

News Room · Published 5 August 2025 · Last updated 12:14 PM

AI tools overhype research findings far more often than humans, with a study suggesting the newest bots are the worst offenders – particularly when they are specifically instructed not to exaggerate.

Dutch and British researchers have found that AI summaries of scientific papers are much more likely than the original authors or expert reviewers to “overgeneralise” the results.

The analysis, reported in the journal Royal Society Open Science, suggests that AI summaries – purportedly designed to help spread scientific knowledge by rephrasing it in “easily understandable language” – tend to ignore “uncertainties, limitations and nuances” in the research by “omitting qualifiers” and “oversimplifying” the text.

This is particularly “risky” when applied to medical research, the report warns. “If chatbots produce summaries that overlook qualifiers [about] the generalisability of clinical trial results, practitioners who rely on these chatbots may prescribe unsafe or inappropriate treatments.”

The team analysed almost 5,000 AI summaries of 200 journal abstracts and 100 full articles. Topics ranged from caffeine’s influence on irregular heartbeats and the benefits of bariatric surgery in reducing cancer risk, to the impacts of disinformation and government communications on residents’ behaviour and people’s beliefs about climate change.

Summaries produced by “older” AI apps – such as OpenAI’s GPT-4 and Meta’s Llama 2, both released in 2023 – proved about 2.6 times as likely as the original abstracts to contain generalised conclusions.

The likelihood of generalisation rose to nine times that of the original abstracts in summaries by ChatGPT-4o, which was released last May, and 39 times in synopses by Llama 3.3, which emerged in December.

Instructions to “stay faithful to the source material” and “not introduce any inaccuracies” produced the opposite effect, with the summaries proving about twice as likely to contain generalised conclusions as those generated when bots were simply asked to “provide a summary of the main findings”.

This suggested that generative AI may be vulnerable to “ironic rebound” effects, in which instructions not to think about something – “a pink elephant”, for example – automatically elicit images of the banned subject.

AI apps also appeared prone to failings like “catastrophic forgetting”, where new information dislodged previously acquired knowledge or skills, and “unwarranted confidence”, where “fluency” took precedence over “caution and precision”.

Fine-tuning the bots can exacerbate these problems, the authors speculate. When AI apps are “optimised for helpfulness” they become less inclined to “express uncertainty about questions beyond their parametric knowledge”. A tool that “provides a highly precise but complex answer…may receive lower ratings from human evaluators,” the paper explains.

One summary cited in the paper reinterpreted a finding that a diabetes drug was “better than placebo” as an endorsement of the “effective and safe treatment” option. “Such…generic generalisations could mislead practitioners into using unsafe interventions,” the paper says.  

It offers five strategies to “mitigate the risks” of overgeneralisations in AI summaries. They include using AI firm Anthropic’s “Claude” family of bots, which were found to produce the “most faithful” summaries.

Another recommendation is to lower the bot’s “temperature” setting. Temperature is an adjustable parameter that controls the randomness of the generated text.
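As a minimal sketch of that recommendation, the snippet below lowers the temperature on a summarisation request using the OpenAI Python SDK. The model name, the placeholder abstract and the value of 0.2 are illustrative assumptions, not settings taken from the study; the prompt wording echoes the “provide a summary of the main findings” instruction quoted above.

    # Minimal sketch: requesting a more conservative summary by lowering
    # temperature. Assumes the OpenAI Python SDK with an OPENAI_API_KEY
    # set in the environment; model, abstract and the 0.2 value are
    # illustrative assumptions.
    from openai import OpenAI

    client = OpenAI()

    abstract = "..."  # the journal abstract to be summarised

    response = client.chat.completions.create(
        model="gpt-4o",   # hypothetical model choice
        temperature=0.2,  # lower temperature = less random, more literal output
        messages=[
            {
                "role": "user",
                "content": f"Provide a summary of the main findings:\n\n{abstract}",
            },
        ],
    )

    print(response.choices[0].message.content)

Lowering the temperature narrows the model’s sampling toward its highest-probability tokens, which tends to produce more literal, less embellished text; it does not by itself guarantee a faithful summary.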

Uwe Peters, an assistant professor in theoretical philosophy at Utrecht University and a co-author of the report, said the overgeneralisations “occurred frequently and systematically”.

He said the findings meant there was a risk that even subtle changes to the findings by the AI could “mislead users and amplify misinformation, especially when the outputs appear polished and trustworthy”.

Tech companies should evaluate their models for such tendencies, he added, and share the results openly. For universities, he said, the findings showed an “urgent need for stronger AI literacy” among staff and students.
