To What Extent Does Agent-generated Code Require Maintenance? An Empirical Study

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates the long-term maintainability of code generated by LLM-based coding agents in real-world software projects, focusing on maintenance frequency, human involvement, and types of modifications. Using the AIDev dataset and GitHub repositories, the authors analyze over 1,000 files and approximately 3,200 changes across 100 popular repositories, comparing AI-generated files with human-authored code through commit histories, change pattern classification, and statistical comparisons. The work presents the first systematic quantification of maintenance characteristics of AI-generated code, finding that such files are maintained less frequently than human-authored code and that updates affect only a small fraction of file size. The most frequent modifications to AI-generated code are feature extensions, whereas human updates focus on bug fixes, and human developers perform the large majority of the maintenance. These results challenge the assumption that AI-generated code requires frequent correction.

📝 Abstract

LLM-based autonomous coding agents have reshaped software development. While these agents excel at code generation, open questions persist about the long-term maintainability of AI-generated code. This study empirically investigates the maintenance extent, human involvement, and modification types of AI-generated files versus human-authored code. Using the AIDev dataset of AI-generated pull requests together with GitHub data, we analyzed over 1,000 files and approximately 3,200 changes from 100 popular repositories. Our findings show that: (i) AI-generated files receive less frequent maintenance than human-authored code, with updates affecting only a small fraction of file size; (ii) the most frequent modifications to AI code are feature extensions, whereas human updates focus on bug fixes; and (iii) human developers perform the large majority of this maintenance.
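
The two quantities the abstract compares, how often a file is maintained and how much of the file each update touches, can be illustrated with a small sketch. This is not the authors' pipeline: the `Change` record, its field names, the `maintenance_summary` helper, and the toy numbers are all hypothetical, and the sketch assumes that change records have already been extracted from commit histories.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Change:
    """One post-creation modification to a file (hypothetical record layout)."""
    path: str           # file that was modified
    origin: str         # "ai" or "human": who originally authored the file
    lines_changed: int  # added + deleted lines in this change
    file_size: int      # file length in lines before the change


def maintenance_summary(changes, files_by_origin):
    """Per origin: mean changes per file and mean fraction of the file touched."""
    counts = defaultdict(int)
    fractions = defaultdict(list)
    for c in changes:
        counts[c.origin] += 1
        if c.file_size > 0:  # skip empty files to avoid division by zero
            fractions[c.origin].append(c.lines_changed / c.file_size)

    summary = {}
    for origin, n_files in files_by_origin.items():
        summary[origin] = {
            "changes_per_file": counts[origin] / n_files if n_files else 0.0,
            "mean_change_fraction": mean(fractions[origin]) if fractions[origin] else 0.0,
        }
    return summary


# Toy data, invented purely for illustration (not results from the paper).
toy_changes = [
    Change("a.py", "ai", lines_changed=5, file_size=100),
    Change("b.py", "human", lines_changed=30, file_size=100),
    Change("b.py", "human", lines_changed=10, file_size=100),
]
toy_files = {"ai": 2, "human": 1}  # number of tracked files per origin

for origin, stats in maintenance_summary(toy_changes, toy_files).items():
    print(origin, stats)
```

Dividing by the number of tracked files, rather than the number of changed files, keeps never-modified files in the denominator, which matters when the question is how often code needs maintenance at all.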

Problem

Research questions and friction points this paper is trying to address.

AI-generated code

code maintainability

software maintenance

LLM-based coding agents

empirical study

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-generated code

code maintainability

empirical study