🤖 AI Summary
The actual role of agent context files (often termed "agent READMEs") as persistent, project-level instructions for agentic coding tools has lacked empirical study, particularly regarding their coverage of non-functional requirements (NFRs) such as security and performance. Method: We conduct the first large-scale empirical study of 2,303 agent context files from 1,925 repositories, combining content analysis with quantitative analysis to characterize their usage patterns and evolution. Contribution/Results: We find a strong functional bias: 62.3% of files document build and run commands, 69.9% implementation details, and 67.7% architecture, whereas security and performance are each specified in only 14.5%, leaving few guardrails for automated code generation. We propose a 16-category instruction taxonomy grounded in observed usage and show that context files evolve more like configuration code than static documentation, maintained through frequent, small additions. Our findings provide an evidence-based foundation for improving agent toolchains, embedding NFR constraints into context-aware systems, and establishing best practices in context engineering.
📝 Abstract
Agentic coding tools receive goals written in natural language as input, break them down into specific tasks, and write or execute the actual code with minimal human intervention. Central to this process are agent context files ("READMEs for agents") that provide persistent, project-level instructions. In this paper, we conduct the first large-scale empirical study of 2,303 agent context files from 1,925 repositories to characterize their structure, maintenance, and content. We find that these files are not static documentation but complex, difficult-to-read artifacts that evolve like configuration code, maintained through frequent, small additions. Our content analysis of 16 instruction types shows that developers prioritize functional context, such as build and run commands (62.3%), implementation details (69.9%), and architecture (67.7%). We also identify a significant gap: non-functional requirements like security (14.5%) and performance (14.5%) are rarely specified. These findings indicate that while developers use context files to make agents functional, they provide few guardrails to ensure that agent-written code is secure or performant, highlighting the need for improved tooling and practices.