Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the real-world contributions of autonomous code-generating agents in open-source software projects and their impact on code quality and maintainability. Leveraging a novel dataset comprising approximately 110,000 pull requests, the work presents the first large-scale longitudinal analysis comparing five prominent agent types by tracking the merge outcomes, developer interactions, and long-term evolution of agent-generated code. The findings reveal that, despite steadily increasing agent involvement, code produced by these agents exhibits significantly lower retention stability compared to human-authored code, being more frequently modified or removed in subsequent commits. This highlights a critical challenge regarding the long-term maintenance burden introduced by current AI coding agents, suggesting that their integration into collaborative software development may entail hidden sustainability costs.
📝 Abstract
The rise of large language models for code has reshaped software development. Autonomous coding agents, able to create branches, open pull requests, and perform code reviews, now actively contribute to real-world projects. Their growing role offers a unique and timely opportunity to investigate AI-driven contributions and their effects on code quality, team dynamics, and software maintainability. In this work, we construct a novel dataset of approximately 110,000 open-source pull requests, including associated commits, comments, reviews, issues, and file changes, collectively representing millions of lines of source code. We compare five popular coding agents: OpenAI Codex, Claude Code, GitHub Copilot, Google Jules, and Devin, examining how their usage differs across development aspects such as merge frequency, edited file types, and developer interaction signals, including comments and reviews. Furthermore, we emphasize that code authoring and review are only a small part of the larger software engineering process, as the resulting code must also be maintained and updated over time. Hence, we offer several longitudinal estimates of survival and churn rates for agent-generated versus human-authored code. Ultimately, our findings indicate increasing agent activity in open-source projects, although agent contributions are associated with more churn over time than human-authored code.
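To make the survival/churn framing concrete, here is a minimal sketch of how a line-level survival rate could be computed: for each added line, record its author class (agent vs. human) and when, if ever, it was later modified or removed, then measure the fraction of a cohort's lines still intact after a given horizon. The data records and function below are hypothetical illustrations, not the paper's actual pipeline or dataset.

```python
from datetime import datetime, timedelta

# Hypothetical records: one entry per added line, with its author class,
# addition date, and the date it was later changed/removed (None = still intact).
added_lines = [
    {"author": "agent", "added": datetime(2025, 1, 1), "changed": datetime(2025, 1, 20)},
    {"author": "agent", "added": datetime(2025, 1, 1), "changed": None},
    {"author": "human", "added": datetime(2025, 1, 1), "changed": None},
    {"author": "human", "added": datetime(2025, 1, 1), "changed": datetime(2025, 3, 1)},
]

def survival_rate(lines, author, horizon_days):
    """Fraction of `author`'s lines still unchanged `horizon_days` after addition."""
    cohort = [ln for ln in lines if ln["author"] == author]
    horizon = timedelta(days=horizon_days)
    surviving = sum(
        1 for ln in cohort
        if ln["changed"] is None or (ln["changed"] - ln["added"]) > horizon
    )
    return surviving / len(cohort) if cohort else float("nan")

print(survival_rate(added_lines, "agent", 30))   # 0.5: one of two agent lines survived 30 days
print(survival_rate(added_lines, "human", 30))   # 1.0: both human lines survived 30 days
```

In practice, line lifetimes would come from version-control history (e.g. tracking lines across commits with `git blame` or `git log -L`), and survival would typically be estimated with methods that handle censoring, since many lines have simply not been changed yet by the end of the observation window.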
Problem

Research questions and friction points this paper is trying to address.

autonomous coding agents
code quality
software maintainability
longitudinal analysis
open-source contributions
Innovation

Methods, ideas, or system contributions that make the work stand out.

autonomous coding agents
longitudinal code analysis
code churn
open-source contributions
AI in software engineering
Razvan Mihai Popescu
Delft University of Technology

David Gros
University of California, Davis

Andrei Botocan
Delft University of Technology

Rahul Pandita
Staff Researcher, GitHub Inc.
Developer Productivity, HCI, Automated Software Engineering, Software Security

Prem Devanbu
University of California, Davis

Maliheh Izadi
Assistant Professor, Delft University of Technology, The Netherlands
Software engineering, Evaluation, AI4SE, LLM4Code, Agents