A Study of Library Usage in Agent-Authored Pull Requests

📅 2025-12-12

🤖 AI Summary

Problem: Prior studies offer little empirical evidence on how AI coding agents use third-party libraries in real-world software development. Method: We conduct a large-scale empirical analysis of library usage across 26,760 agent-authored pull requests (PRs) from the AIDev dataset, employing dependency parsing, version string identification, and library clustering to quantify import frequency, new dependency introduction rates, and version declaration practices. Contribution/Results: (1) 29.5% of PRs contain library imports, yet only 1.3% introduce *new* dependencies, which suggests that agents mostly reuse existing dependencies rather than expand the dependency graph; (2) when agents do add a dependency, 75.0% of the time they specify a version, an improvement on direct LLM usage, where versions are rarely mentioned; (3) agents draw on a surprisingly diverse set of external libraries, in contrast to the limited "library preferences" reported in prior non-agentic LLM studies. These findings offer an early empirical baseline for assessing the engineering soundness and ecosystem impact of AI coding agents.

📝 Abstract

Coding agents are becoming increasingly capable of completing end-to-end software engineering workflows that previously required a human developer, including raising pull requests (PRs) to propose their changes. However, we still know little about how these agents use libraries when generating code, a core part of real-world software development. To fill this gap, we study 26,760 agent-authored PRs from the AIDev dataset to examine three questions: how often do agents import libraries, how often do they introduce new dependencies (and with what versioning), and which specific libraries do they choose? We find that agents often import libraries (29.5% of PRs) but rarely add new dependencies (1.3% of PRs). When they do add one, they follow strong versioning practices (75.0% specify a version), an improvement on direct LLM usage, where versions are rarely mentioned. Overall, agents draw from a surprisingly diverse set of external libraries, contrasting with the limited "library preferences" seen in prior non-agentic LLM studies. Our results offer an early empirical view into how AI coding agents interact with today's software ecosystems.
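To make the first two questions concrete, here is a minimal sketch of how imports and newly added, version-pinned dependencies might be detected in a PR diff. It is not the authors' pipeline: the function names, the regular expressions, and the restriction to Python imports and `requirements.txt`-style manifests are all assumptions made for illustration. It also does not separate standard-library modules from third-party ones, which a real analysis would need to do.

```python
import re

# Assumption: Python-style import lines added in a unified diff ("+import x", "+from x import y").
IMPORT_RE = re.compile(r"^\+\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)")

# Assumption: requirements.txt-style lines added in a diff, e.g. "+requests==2.31.0".
# Group 2 is present only when a version specifier follows the package name.
REQ_RE = re.compile(
    r"^\+\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*((?:==|>=|<=|~=|!=|>|<)\s*\S+)?\s*$"
)


def imported_libraries(diff_text):
    """Return the top-level module names imported on added lines of a diff."""
    libs = set()
    for line in diff_text.splitlines():
        if line.startswith("+++"):  # file header, not an added line
            continue
        match = IMPORT_RE.match(line)
        if match:
            libs.add(match.group(1))
    return libs


def new_dependencies(manifest_diff):
    """Map each dependency added in a manifest diff to whether it has a version."""
    deps = {}
    for line in manifest_diff.splitlines():
        if line.startswith("+++"):
            continue
        match = REQ_RE.match(line)
        if match:
            deps[match.group(1)] = match.group(2) is not None
    return deps
```

Applied to a single PR, `imported_libraries` answers "does this PR import anything?", while `new_dependencies` answers both "does it add a dependency?" and "is that dependency versioned?".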

Problem

Research questions and friction points this paper is trying to address.

Analyze library import frequency in agent-authored pull requests.

Examine new dependency introduction and versioning practices by agents.

Identify specific library choices and diversity in agent-generated code.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzes library import frequency in agent-authored PRs

Measures new dependency addition rates and versioning practices

Characterizes the diversity of external libraries that agents use
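The three measurements listed above reduce to simple ratios and counts once each PR has been annotated. The sketch below shows one way to aggregate them; the record layout (`imports` as a set of library names, `new_deps` as a mapping from dependency name to a has-version flag) and the function name `summarize` are hypothetical and are not taken from the paper or the AIDev dataset.

```python
from collections import Counter


def summarize(prs):
    """Aggregate per-PR annotations into usage statistics.

    Each PR is a dict with two hypothetical keys:
      'imports'  - set of library names imported in the PR
      'new_deps' - dict mapping a newly added dependency to True if versioned
    """
    total = len(prs)
    with_imports = sum(1 for pr in prs if pr["imports"])
    with_new_deps = sum(1 for pr in prs if pr["new_deps"])
    version_flags = [flag for pr in prs for flag in pr["new_deps"].values()]
    # Count each library once per PR that imports it.
    lib_counts = Counter(lib for pr in prs for lib in pr["imports"])
    return {
        "import_rate": with_imports / total if total else 0.0,
        "new_dep_rate": with_new_deps / total if total else 0.0,
        "versioned_rate": sum(version_flags) / len(version_flags) if version_flags else 0.0,
        "distinct_libraries": len(lib_counts),
        "top_libraries": lib_counts.most_common(3),
    }
```

Note that `versioned_rate` here is computed over added dependencies rather than over PRs, which matches the reading that the 75.0% figure describes new dependencies that specify a version; that reading is itself an assumption about how the paper defines the denominator.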
