AI Summary
This study investigates whether AI coding agents rely too heavily on mocking when they generate tests, which can make tests harder to understand and maintain. Through a large-scale empirical analysis of commit histories from open-source TypeScript, JavaScript, and Python repositories, it compares the mocking practices of AI agents with those of non-agent contributors. The findings show that agents are more likely to introduce mocks: 36% of agent-generated test commits add mocks, compared with 26% for non-agents. Moreover, 60% of repositories with agent activity contain agent test activity, and 68% of those also contain agent mock activity. These results suggest that tests with mocks may be easier for agents to generate automatically but less effective at validating real component interactions.
Abstract
Coding agents have recently seen significant adoption in software development. Unlike traditional LLM-based code completion tools, coding agents work with autonomy (e.g., invoking external tools) and leave visible traces in software repositories, such as authoring commits. Among their tasks, coding agents may autonomously generate software tests; however, the quality of these tests remains uncertain. In particular, excessive use of mocking can make tests harder to understand and maintain. This paper presents the first study to investigate the presence of mocks in agent-generated tests of real-world software systems. We analyzed over 1.2 million commits made in 2025 in 2,168 TypeScript, JavaScript, and Python repositories, including 48,563 commits by coding agents, 169,361 commits that modify tests, and 44,900 commits that add mocks to tests. Overall, we find that coding agents are more likely than non-agents to modify tests and to add mocks to tests. We detect that (1) 60% of the repositories with agent activity also contain agent test activity; (2) 23% of commits made by coding agents add/change test files, compared with 13% by non-agents; (3) 68% of the repositories with agent test activity also contain agent mock activity; (4) 36% of commits made by coding agents add mocks to tests, compared with 26% by non-agents; and (5) repositories created recently contain a higher proportion of test and mock commits made by agents. Finally, we conclude by discussing implications for developers and researchers. We call attention to the fact that tests with mocks may be easier to generate automatically (but less effective at validating real interactions), and to the need to include guidance on mocking practices in agent configuration files.
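As an illustration of the closing point, the following minimal Python sketch shows how a mocked test can pass while the real interaction is broken. It is not taken from the study: `RateStore`, `price_with_tax`, and both check functions are hypothetical names invented for this example, and only `unittest.mock.Mock` is a real library API.

```python
from unittest.mock import Mock


class RateStore:
    """Hypothetical collaborator: returns a tax rate as a percentage (20 means 20%)."""

    def rate_for(self, region: str) -> float:
        return {"EU": 20.0}.get(region, 0.0)


def price_with_tax(net: float, region: str, store) -> float:
    # Deliberate bug for illustration: the rate is treated as a fraction (0.20),
    # although RateStore returns a percentage (20.0).
    return net * (1 + store.rate_for(region))


def mocked_test_passes() -> bool:
    # The mock encodes the test author's assumption (a fraction),
    # not the behaviour of the real collaborator.
    store = Mock()
    store.rate_for.return_value = 0.20
    return abs(price_with_tax(100.0, "EU", store) - 120.0) < 1e-9


def real_test_passes() -> bool:
    # Exercising the real collaborator exposes the unit mismatch.
    return abs(price_with_tax(100.0, "EU", RateStore()) - 120.0) < 1e-9
```

In this sketch the mocked check succeeds and the check against the real `RateStore` fails, because the mock only confirms what the test author assumed about the collaborator. This is one way a mock-heavy test can be simple to produce yet weak at validating real interactions.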