In the high-stakes world of AI, ‘Context Engineering’ has emerged as the latest frontier for squeezing performance out of LLMs. Industry leaders have praised AGENTS.md (and its cousins, such as CLAUDE.md) as the ultimate configuration point for coding agents: a repository-level ‘North Star’ injected into every conversation to guide the AI through complex codebases.
Researchers at ETH Zurich have just delivered a huge reality check: if you do not take care with the context files you create, you are likely sabotaging agent performance, and paying over 20% more for the privilege.
It’s all about the data: more tokens = less success
The ETH Zurich team analyzed coding agents built on models such as Sonnet-4.5, GPT-5.2, and Qwen3-30B, across established benchmarks as well as a new set of real-world tasks called AGENTBENCH. The results were surprisingly lopsided.
- Auto-Generated Tax: Context files generated automatically reduced success rates by approximately 3%.
- The Cost of ‘Help’: These files increased inference costs by over 20%; it takes more reasoning to accomplish the same thing.
- Human Margin: Even manually written files provided only a performance gain of about 4%.
- The Intelligence Cap: Interestingly, using stronger models like GPT-5.2 to create these files does not give better results. Stronger models often have enough ‘parametric knowledge’ of common libraries that the extra context becomes redundant noise.
Why ‘Good’ Context Fails
The researchers identified an important behavioral trap: AI agents can be too obedient. They tend to follow context-file instructions even when those instructions are not necessary.
For example, the researchers found that codebase overviews and directory listings, a staple of most AGENTS.md files, did not help agents navigate faster. Agents are surprisingly good at discovering file structures on their own; reading a manual listing just consumes reasoning tokens and adds ‘mental’ overhead. LLM-generated files are also often redundant when your repository already has good documentation.

Context Engineering and the New Rules
To make context files actually helpful, you need to shift from ‘comprehensive documentation’ to ‘surgical intervention.’
1. What to Include (The ‘Vital Few’)
- The Technical Stack & Intent: Explain the ‘what’ and the ‘why.’ Help the agent understand the architecture and purpose of the project (e.g., a monorepo).
- The Non-Obvious Tooling: This is where AGENTS.md shines. Name the specific tools used to test and validate changes, such as uv instead of pip, or bun instead of npm.
- The Multiplier Effect: The data shows that tools mentioned in the context file get used far more frequently. uv usage increased roughly 160x when it was explicitly stated (1.6 uses versus 0.01).
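Putting the ‘vital few’ together, a lean AGENTS.md might look like the sketch below. This is a hypothetical example: the project name, directory layout, and commands are illustrative, not taken from the study.

```markdown
# AGENTS.md

## Project
Monorepo for the `acme-api` backend (Python) and `acme-web` frontend (TypeScript).
Backend code lives under `services/api/`, frontend under `apps/web/`.

## Tooling (non-obvious)
- Python dependencies and scripts: use `uv` (e.g. `uv run pytest`), not `pip`.
- JS package manager: use `bun` (e.g. `bun install`, `bun test`), not `npm`.

## Validation
Run `uv run pytest services/api/tests` before proposing backend changes.
```

Note what the file does: it states intent, names the non-obvious tools (the uv/bun choices an agent cannot infer), and stops.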
2. What to Exclude (The ‘Noise’)
- Detailed Directory Trees: Skip them. Agents can locate the files they need without a map.
- Style Guides: Do not waste tokens telling the agent to ‘use camelCase.’ Use deterministic linters and formatters instead—they are cheaper, faster, and more reliable.
- Task-Specific Instructions: Avoid rules that only cover a small fraction of the tasks you face.
- Unvetted Auto-Generated Content: Do not let the agent create its own context file without human review. The study shows that ‘stronger’ models do not necessarily make better guides.
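For contrast, here is the kind of content the study suggests cutting. This is a hypothetical ‘before’ fragment; every line in it either duplicates what the agent can discover on its own or states a rule a deterministic formatter could enforce.

```markdown
<!-- Anti-pattern: directory tree the agent can discover by itself -->
## Project layout
- src/
  - components/
  - utils/
- tests/

<!-- Anti-pattern: style rules a linter/formatter enforces more reliably -->
## Style
- Use camelCase for variable names.
- Keep lines under 100 characters.
```

Deleting both sections loses nothing: the agent finds the structure itself, and a linter applies the style rules without spending a single context token.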
3. The Structure
- Maintain a Lean Body: The consensus is that high-performing context files stay under 300 lines. Professional teams often keep theirs even tighter, under 60 lines. Because every line is injected into every session, every line has a cost.
- Progressive Disclosure: Do not put all the information in one file. Use the main file to direct the agent to task-specific documents (e.g., agent_docs/testing.md) only as applicable.
- Pointers over Copies: Instead of embedding code that eventually goes stale and is difficult to maintain, embed pointers (e.g., file:line) that show the agent specific designs or interfaces.
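Progressive disclosure and pointers-over-copies can be combined in the main file. In the sketch below, the paths and line numbers are hypothetical (only agent_docs/testing.md and the file:line convention come from the text above); they are shown to illustrate the shape, not as real locations.

```markdown
# AGENTS.md (top level)

## Task-specific docs (read only when relevant)
- Writing or fixing tests: see `agent_docs/testing.md`
- Database migrations: see `agent_docs/migrations.md`

## Key interfaces (pointers, not copies)
- Request validation entry point: `src/api/validate.py:42`
- Retry/backoff policy: `src/net/retry.py:17`
```

The pointers stay valid as long as the files do, and the agent only pays the token cost of a sub-document when the task actually calls for it.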
What you need to know
- The Negative Effect of Auto-Generation: Context files generated by LLMs tend to lower task completion rates by approximately 3% on average compared to providing no repository context at all.
- Major Cost Increase: Including context files increases inference costs by over 20%, largely because agents take more steps to complete tasks.
- Minimal Human Benefit: Context files written by developers perform slightly better, but the improvement is only a marginal 4% over using no context files.
- Redundant Navigation Aids: The codebase details in context files duplicate information available elsewhere and do not speed up the search for relevant files.
- Strict Instruction Following: Models follow context-file instructions to the letter, so they are often hampered by unrealistic or unnecessary requirements.
Check out the Paper for more details.


