World of Software

New Research Reassesses the Value of AGENTS.md Files for AI Coding

News Room
Last updated: 2026/03/06 at 2:24 PM
News Room Published 6 March 2026

Despite widespread industry recommendations, a new ETH Zurich paper concludes that AGENTS.md files may often hinder AI coding agents. The researchers recommend omitting LLM-generated context files entirely and limiting human-written instructions to non-inferable details, such as highly specific tooling or custom build commands.

The team (Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin Raychev, Martin Vechev) justified the research by noting that while 60,000 open-source repositories currently contain context files such as AGENTS.md, and many agent frameworks feature built-in commands to auto-generate them, there has been no rigorous empirical investigation into whether these files actually improve an AI agent’s ability to resolve real-world coding tasks.

The researchers (one of whom contributed to the Humanity’s Last Exam benchmark) built AGENTbench, a novel dataset of 138 real-world Python tasks sourced from niche repositories. This setup deliberately avoids the bias of popular benchmarks like SWE-bench, which AI models may have partially memorized. The team tested four agents (Claude 3.5 Sonnet, Codex with GPT-5.2 and GPT-5.1 mini, and Qwen Code) across three distinct scenarios: no context file, an LLM-generated file, and a human-written file. All chosen repositories already featured human-written context files, so the first two scenarios were tested by removing or replacing those files. The researchers assessed the real-world impact of repository-level instructions by tracking three proxy indicators: task success rate (as determined by repository unit tests), the number of agent steps, and overall inference cost.
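
The three proxy indicators can be made concrete with a small aggregation sketch. The following Python is a minimal illustration, not the authors’ evaluation harness: the `Run` record, the scenario labels, and all numbers are hypothetical and chosen only to show how a success rate, an average step count, and an average cost might be computed per scenario.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """One agent attempt at one task (hypothetical record layout)."""
    scenario: str    # assumed labels: "none", "llm", or "human"
    passed: bool     # whether the repository's unit tests passed
    steps: int       # number of agent steps taken
    cost_usd: float  # inference cost of the attempt


def summarize(runs):
    """Group runs by scenario and compute the three proxy metrics."""
    by_scenario = {}
    for run in runs:
        by_scenario.setdefault(run.scenario, []).append(run)
    return {
        scenario: {
            "success_rate": mean(1.0 if r.passed else 0.0 for r in group),
            "avg_steps": mean(r.steps for r in group),
            "avg_cost_usd": mean(r.cost_usd for r in group),
        }
        for scenario, group in by_scenario.items()
    }


# Made-up sample data for illustration only; not results from the paper.
runs = [
    Run("none", True, 10, 0.50),
    Run("none", False, 14, 0.70),
    Run("llm", False, 16, 0.80),
    Run("llm", True, 12, 0.60),
]
summary = summarize(runs)
for scenario, metrics in summary.items():
    print(scenario, metrics)
```

Keeping the per-run records separate from the aggregation makes it straightforward to add a third scenario or another metric without touching the summary logic.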

The researchers found that LLM-generated context files degraded performance, reducing the task success rate by an average of 3% compared to providing no context file at all. These files also consistently increased the number of steps the agents took, driving up inference costs by over 20%.

On the other hand, human-written files did offer marginal gains, with a 4% average increase in task success rate on AGENTbench. This gain, however, was offset by a parallel increase in the number of steps, which raised costs by up to 19%.
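
Percentage figures like these can be read either as percentage-point differences or as relative changes, and the two differ. The sketch below, with made-up inputs that are not taken from the paper, shows both calculations; the function names are illustrative.

```python
def percentage_point_change(baseline_rate, new_rate):
    """Absolute difference between two rates, in percentage points.

    Both arguments are fractions in [0, 1], e.g. 0.50 for 50%.
    """
    return (new_rate - baseline_rate) * 100


def relative_change_pct(baseline, new):
    """Change relative to the baseline value, as a percentage."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative change")
    return (new - baseline) / baseline * 100


# Hypothetical inputs for illustration only.
baseline_success, new_success = 0.50, 0.54
baseline_cost, new_cost = 1.00, 1.19

pp = percentage_point_change(baseline_success, new_success)
rel_success = relative_change_pct(baseline_success, new_success)
rel_cost = relative_change_pct(baseline_cost, new_cost)

print(f"success: {pp:.1f} points, {rel_success:.1f}% relative")
print(f"cost: {rel_cost:.1f}% relative")
```

With these sample inputs, the same success-rate shift is a few percentage points but a larger relative change, which is why it helps to state which measure a reported figure uses.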

Including information such as an architectural overview or an explanation of the repository structure in AGENTS.md files did not seem to reduce the time the model spent locating relevant files for the task at hand.

To understand why performance dropped while costs increased, the authors conducted a deep trace analysis of the agents’ tool calls and reasoning patterns. Agents generally followed the instructions included in the AGENTS.md file. As a result, they ran more tests, read more files, executed more grep searches, and performed more code-quality checks. While thorough, this behavior was often unnecessary for resolving the specific task at hand. The data suggests that the extra context pushes reasoning models to “think” harder without producing better final patches.
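
A trace comparison of this kind can be approximated by counting tool calls per category. The sketch below assumes a hypothetical trace format (a list of event dictionaries with `type` and `tool` keys); the paper’s actual trace schema is not described here, and the sample events are invented.

```python
from collections import Counter


def tool_call_counts(trace):
    """Count tool calls by tool name, ignoring non-tool events."""
    return Counter(
        event["tool"] for event in trace if event.get("type") == "tool_call"
    )


def extra_tool_calls(with_context, without_context):
    """Per-tool difference in call counts between two traces.

    Positive values mean the run with a context file made more calls.
    """
    with_counts = tool_call_counts(with_context)
    without_counts = tool_call_counts(without_context)
    tools = sorted(set(with_counts) | set(without_counts))
    # Counter returns 0 for tools that never appear in a trace.
    return {tool: with_counts[tool] - without_counts[tool] for tool in tools}


# Invented sample traces for illustration only.
without_context = [
    {"type": "tool_call", "tool": "read_file"},
    {"type": "tool_call", "tool": "grep"},
    {"type": "message", "text": "patch ready"},
]
with_context = [
    {"type": "tool_call", "tool": "read_file"},
    {"type": "tool_call", "tool": "grep"},
    {"type": "tool_call", "tool": "grep"},
    {"type": "tool_call", "tool": "run_tests"},
    {"type": "message", "text": "patch ready"},
]

extra = extra_tool_calls(with_context, without_context)
print(extra)
```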

The authors concluded by emphasizing the gap between the study’s findings and the current recommendations made to developers using AI code agents:

We find that all context files consistently increase the number of steps required to complete tasks. LLM-generated context files have a marginal negative effect on task success rates, while developer-written ones provide a marginal performance gain.


Our trace analyses show that instructions in context files are generally followed and lead to more testing and a broader exploration; however, they do not function as effective repository overviews. Overall, our results suggest that context files have only a marginal effect on agent behavior and are likely only desirable when manually written. This highlights a concrete gap between current agent-developer recommendations and observed outcomes, and motivates future work on principled ways to automatically generate concise, task-relevant guidance for coding agents.

Developers received the research with interest. One developer argued that the findings should instead push developers to focus on writing useful AGENTS.md files:

I read the study. I think it does the opposite of what the authors suggest—it’s actually vouching for good AGENTS.md files.

[…] The biggest use case for AGENTS.md files is domain knowledge that the model is not aware of and cannot instantly infer from the project. That is gained slowly over time from seeing the agents struggle due to this deficiency. Exactly the kind of thing very common in closed-source, yet incredibly rare in public GitHub projects that have an AGENTS.md file—the huge majority of which are recent small vibe-coded projects centered around LLMs. If 4% gains are seen on the latter kind of project, which will have a very mixed quality of AGENTS files in the first place, then for bigger projects with high-quality .md‘s they’re invaluable when working with agents.

Another developer noted that context files may just be more useful to developers than to AI harnesses:

I’ve maintained a CLAUDE.md file for about 3 months now across two projects and the improvement is noticeable but not for the reasons you’d expect. The actual token-level context it provides matters less than the fact that writing it forces you to articulate things about your codebase that were previously just in your head. Stuff like “we use this weird pattern for X because of a legacy constraint in Y.” Once that’s written down, the agent picks it up, but so does every new human on the team.

Developers can review the paper online. The use of context files, such as AGENTS.md, CLAUDE.md, or .cursorrules, grew in importance in the second half of 2025, coinciding with a larger push by AI coding agent providers.
