🤖 AI Summary
This study addresses the lack of systematic empirical investigation into merge conflicts arising from AI coding agents in collaborative software development. We present AgenticFlict, a large-scale dataset of textual merge conflicts in pull requests submitted by AI agents, constructed through extensive data collection from GitHub, deterministic merge simulation, and automated conflict detection. Of more than 142,000 AI-generated pull requests collected, more than 107,000 were successfully processed through merge simulation. Among these, we identify more than 29,000 conflicting pull requests, a conflict rate of 27.67%, and extract over 336,000 fine-grained conflict regions. A preliminary exploratory analysis characterizes the frequency and scale of merge conflicts in AI code contributions and how they vary across agents, helping to fill a gap in empirical research on AI-assisted software development.
📝 Abstract
Software Engineering 3.0 marks a paradigm shift in software development, in which AI coding agents are no longer just assistive tools but active contributors. While prior empirical studies have examined productivity gains and acceptance patterns in AI-assisted development, the challenges of integrating agent-generated contributions are less well understood. In particular, merge conflicts, a fundamental aspect of collaborative software development, remain underexplored in this context. In this paper, we present AgenticFlict, a large-scale dataset of textual merge conflicts in AI coding agent pull requests (Agentic PRs). The dataset comprises 142K+ Agentic PRs collected from 59K+ repositories, of which 107K+ are successfully processed through deterministic merge simulation. Our pipeline identifies 29K+ PRs exhibiting merge conflicts, yielding a conflict rate of 27.67%, and extracts 336K+ fine-grained conflict regions across these instances. Our preliminary exploratory analysis indicates that merge conflicts are both frequent and often substantial in AI-generated contributions, with noticeable variation across agents, emphasizing the need to better understand and manage integration challenges in AI-assisted software development. The dataset, code, and supplementary materials are available on Zenodo: https://doi.org/10.5281/zenodo.19396917.
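To make the notion of a "fine-grained conflict region" concrete, the following is a minimal sketch of how such regions could be pulled out of a file that Git has left with textual conflict markers after a conflicting merge. The abstract does not describe the actual AgenticFlict pipeline, so everything here is an assumption made for illustration: the `ConflictRegion` type, the `extract_conflict_regions` function, and the choice to discard the optional base section are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConflictRegion:
    """One textual conflict region (hypothetical type, for illustration)."""
    start_line: int     # 1-based line number of the "<<<<<<<" marker
    end_line: int       # 1-based line number of the ">>>>>>>" marker
    ours: List[str]     # lines from the side being merged into
    theirs: List[str]   # lines from the side being merged


def extract_conflict_regions(text: str) -> List[ConflictRegion]:
    """Scan file content for Git's default 7-character conflict markers.

    Assumption: markers appear at the start of a line. If a diff3-style
    base section ("|||||||") is present, its lines are skipped.
    """
    regions: List[ConflictRegion] = []
    state = None  # None (outside a region), "ours", "base", or "theirs"
    start = 0
    ours: List[str] = []
    theirs: List[str] = []

    for lineno, line in enumerate(text.splitlines(), start=1):
        if state is None and line.startswith("<<<<<<<"):
            state, start, ours, theirs = "ours", lineno, [], []
        elif state == "ours" and line.startswith("|||||||"):
            state = "base"
        elif state in ("ours", "base") and line.startswith("======="):
            state = "theirs"
        elif state == "theirs" and line.startswith(">>>>>>>"):
            regions.append(ConflictRegion(start, lineno, ours, theirs))
            state = None
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
        # Lines in the "base" state or outside any region are ignored.

    return regions


sample = (
    "a\n"
    "<<<<<<< HEAD\n"
    "x = 1\n"
    "=======\n"
    "x = 2\n"
    ">>>>>>> pr-branch\n"
    "b\n"
)
regions = extract_conflict_regions(sample)
```

Under these assumptions, counting the PRs with at least one extracted region and dividing by the number of PRs whose merge simulation completed would give a conflict rate of the kind reported above. A line-prefix scan like this can misfire on files that legitimately contain marker-like lines, so a production pipeline would likely rely on Git's own conflict reporting instead.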