Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?

📅 2026-02-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study presents the first systematic evaluation of the practical impact of repository-level context files—such as AGENTS.md—on the performance of code-generating agents in real-world tasks. By comparing multiple large language model–driven coding agents on the SWE-bench benchmark and a newly curated, human-annotated problem set, with and without access to such context files, the authors find that these files tend to reduce task success rates while increasing inference cost by over 20%. The results challenge the assumption that additional contextual information is inherently beneficial, revealing instead that repository-level context often introduces noise and redundancy that impair agent performance. The study recommends retaining only the minimal necessary information to enhance both the efficiency and accuracy of coding agents.

📝 Abstract
A widespread practice in software development is to tailor coding agents to repositories using context files, such as AGENTS.md, generated either manually or automatically. Although this practice is strongly encouraged by agent developers, there is currently no rigorous investigation into whether such context files are actually effective for real-world tasks. In this work, we study this question and evaluate coding agents' task completion performance in two complementary settings: established SWE-bench tasks from popular repositories, with LLM-generated context files following agent-developer recommendations, and a novel collection of issues from repositories containing developer-committed context files. Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%. Behaviorally, both LLM-generated and developer-provided context files encourage broader exploration (e.g., more thorough testing and file traversal), and coding agents tend to respect their instructions. Ultimately, we conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.
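The evaluation design the abstract describes is a paired comparison: each task is run with and without a context file, then success rates and inference costs are compared across conditions. A minimal sketch of that analysis, with all field names and numbers purely illustrative (not taken from the paper):

```python
# Hypothetical paired-comparison analysis: each task is attempted once with
# a repository context file and once without, and we aggregate the change in
# success rate and the relative change in inference cost.
# The record layout and example values are assumptions for illustration only.

def compare_conditions(results):
    """results: list of per-task records with keys
    'solved_with_ctx', 'solved_without_ctx' (0/1) and
    'cost_with_ctx', 'cost_without_ctx' (e.g., dollars or tokens)."""
    n = len(results)
    rate_with = sum(r["solved_with_ctx"] for r in results) / n
    rate_without = sum(r["solved_without_ctx"] for r in results) / n
    cost_with = sum(r["cost_with_ctx"] for r in results)
    cost_without = sum(r["cost_without_ctx"] for r in results)
    return {
        # Negative delta means context files hurt success rate.
        "success_delta": rate_with - rate_without,
        # Positive percentage means context files increased cost.
        "cost_increase_pct": 100.0 * (cost_with - cost_without) / cost_without,
    }

# Toy example with two tasks (values invented):
example = [
    {"solved_with_ctx": 0, "solved_without_ctx": 1,
     "cost_with_ctx": 1.3, "cost_without_ctx": 1.0},
    {"solved_with_ctx": 1, "solved_without_ctx": 1,
     "cost_with_ctx": 1.2, "cost_without_ctx": 1.0},
]
print(compare_conditions(example))
```

On this toy input the success delta is negative and the cost increase is positive, mirroring the direction of the paper's reported findings; the actual study aggregates over SWE-bench tasks and a curated issue set across multiple agents and LLMs.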
Problem

Research questions and friction points this paper is trying to address.

coding agents
repository-level context
AGENTS.md
task performance
context effectiveness
Innovation

Methods, ideas, or system contributions that make the work stand out.

repository-level context
coding agents
AGENTS.md
task completion performance
context file evaluation