🤖 AI Summary
This study addresses the lack of systematic understanding of how often AI-generated tests appear in real-world software development, how they are structured, and how much coverage they provide. Using the AIDev dataset, we conduct an empirical analysis of 2,232 commits containing test-related changes, combining data mining, code structure parsing, and coverage evaluation to provide the first large-scale characterization of AI-generated tests in practice. Our findings show that AI authored 16.4% of all commits that add tests. AI-generated test methods are longer, have a higher assertion density, and follow more linear control flow (lower cyclomatic complexity) than human-written tests. Their code coverage is comparable to that of human-written tests, and they frequently yield positive coverage gains across several projects.
📝 Abstract
Agent-based coding tools have transformed software development practices. Unlike prompt-based approaches that require developers to manually integrate generated code, these agent-based tools autonomously interact with repositories to create, modify, and execute code, including test generation. While many developers have adopted agent-based coding tools, little is known about how these tools generate tests in real-world development scenarios or how AI-generated tests compare to human-written ones.
This study presents an empirical analysis of test generation by agent-based coding tools using the AIDev dataset. We extracted 2,232 commits containing test-related changes and investigated three aspects: the frequency of test additions, the structural characteristics of the generated tests, and their impact on code coverage. Our findings reveal that (i) AI authored 16.4% of all commits adding tests in real-world repositories, (ii) AI-generated test methods exhibit distinct structural patterns: they are longer and have a higher density of assertions, while their largely linear logic keeps cyclomatic complexity lower, and (iii) AI-generated tests achieve code coverage comparable to that of human-written tests, frequently yielding positive coverage gains across several projects.
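To make the structural metrics in finding (ii) concrete, the sketch below computes method length, assertion density, and an approximate cyclomatic complexity for Python test functions. It is only an illustration under stated assumptions: the abstract does not say which languages or tools the study used, and the function name `measure_test_methods`, the `SAMPLE` snippet, and the choice of counted decision points are all hypothetical.

```python
import ast

# Node types treated as decision points. This set is an assumption;
# real complexity tools differ in which constructs they count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def measure_test_methods(source: str) -> dict:
    """Return per-test-function structural metrics for a Python source string."""
    results = {}
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.FunctionDef) and node.name.startswith("test")):
            continue
        loc = node.end_lineno - node.lineno + 1  # lines spanned by the function
        asserts = 0
        decisions = 0
        for child in ast.walk(node):
            if isinstance(child, ast.Assert):
                asserts += 1  # plain `assert` statement
            elif (isinstance(child, ast.Call)
                  and isinstance(child.func, ast.Attribute)
                  and child.func.attr.startswith("assert")):
                asserts += 1  # unittest-style call such as self.assertEqual(...)
            elif isinstance(child, _BRANCH_NODES):
                decisions += 1
            elif isinstance(child, ast.BoolOp):
                decisions += len(child.values) - 1  # each and/or adds a path
        results[node.name] = {
            "loc": loc,
            "assertions": asserts,
            "assertion_density": asserts / loc,
            "cyclomatic_complexity": decisions + 1,
        }
    return results


# Hypothetical input: one linear test and one branching test.
SAMPLE = '''
def test_linear():
    result = add(2, 3)
    assert result == 5
    assert result > 0

def test_branching():
    for x in range(3):
        if x % 2 == 0:
            assert x >= 0
'''

if __name__ == "__main__":
    for name, metrics in measure_test_methods(SAMPLE).items():
        print(name, metrics)
```

In this sample, `test_linear` has more assertions per line and no branches, which is the shape the abstract attributes to AI-generated tests, whereas `test_branching` has fewer assertions and a higher complexity score.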