AI Summary
This study investigates the impact of autonomous coding agents on developer productivity and code quality in real-world software projects, benchmarking them against IDE-integrated AI assistants. Leveraging the AIDev dataset, the authors employ a staggered difference-in-differences approach with matched control groups and conduct a longitudinal causal analysis using multidimensional metrics, including static analysis warnings, cognitive complexity, code duplication, and comment density. The work provides the first empirical evidence of heterogeneous effects between the two AI tool types: autonomous agents yield only transient productivity gains when users have no prior AI exposure, yet they increase static analysis warnings by approximately 18% and cognitive complexity by roughly 39%. These findings indicate that autonomous agents introduce persistent complexity debt, underscoring the need for targeted deployment strategies and robust quality assurance mechanisms.
Abstract
Large language model (LLM) based coding agents increasingly act as autonomous contributors that generate and merge pull requests, yet their real-world effects on software projects are unclear, especially compared with widely adopted IDE-based AI assistants. We present a longitudinal causal study of agent adoption in open-source repositories using staggered difference-in-differences with matched controls. Using the AIDev dataset, we define adoption as the first agent-generated pull request and analyze monthly repository-level outcomes spanning development velocity (commits, lines added) and software quality (static-analysis warnings, cognitive complexity, duplication, and comment density). Results show large, front-loaded velocity gains only when agents are the first observable AI tool in a project; repositories with prior AI IDE usage experience minimal or short-lived throughput increases. In contrast, quality risks are persistent across settings, with static-analysis warnings and cognitive complexity rising by roughly 18% and 39% respectively, indicating sustained agent-induced technical debt even after velocity advantages fade. These heterogeneous effects suggest diminishing returns to AI assistance and highlight the need for quality safeguards, provenance tracking, and selective deployment of autonomous agents. Our findings establish an empirical basis for understanding how agentic and IDE-based tools interact, and motivate research on balancing acceleration with maintainability in AI-integrated development workflows. The replication package for this study is publicly available at https://github.com/shyamagarwal13/agentic-coding-impact.
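The staggered difference-in-differences design named above can be illustrated with a minimal two-way fixed-effects sketch. This is not the authors' estimation code: the panel below is synthetic (not the AIDev dataset), and the variable names (`repo`, `month`, `post`, `commits`) and adoption months are hypothetical; it only shows the general shape of such an analysis, where repository and month fixed effects absorb level differences and the coefficient on the post-adoption indicator estimates the average treatment effect.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel: 40 repositories observed over 24 months; the first 20
# "adopt" an agent at a staggered month (hypothetical data, not AIDev).
rng = np.random.default_rng(0)
rows = []
for r in range(40):
    adopt = int(rng.integers(6, 18)) if r < 20 else None  # staggered adoption
    for m in range(24):
        treated = int(adopt is not None and m >= adopt)
        # Outcome = repo level + common monthly trend + treatment effect + noise.
        y = 10 + 0.5 * r + 0.2 * m + 3.0 * treated + rng.normal(0, 1)
        rows.append({"repo": r, "month": m, "post": treated, "commits": y})
df = pd.DataFrame(rows)

# Two-way fixed-effects DiD: C(repo) and C(month) dummies absorb repository
# and calendar-time effects; the `post` coefficient is the estimated effect
# of agent adoption on monthly commits (true effect here is 3.0).
model = smf.ols("commits ~ post + C(repo) + C(month)", data=df).fit()
print(f"Estimated adoption effect on commits: {model.params['post']:.2f}")
```

Note that with staggered adoption the plain two-way fixed-effects estimator can be biased when treatment effects vary over time; modern staggered-DiD estimators address this, which is why the matched-control design matters in the study.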