🤖 AI Summary
This work addresses automated merge conflict resolution, which is hindered by ambiguous developer intent and intricate cross-file dependencies. The authors propose Rover, a system that integrates program analysis with large language models (LLMs). Rover builds a Multi-layer Code Property Graph (MtCPG) to capture cross-file dependencies, then applies graph connectivity algorithms to cluster conflicting code regions together with their associated changes. These clusters serve as the contexts for prompts that guide the LLM toward accurate resolutions. In the reported evaluation, Rover outperforms standalone LLMs, MergeGen, and WizardMerge, producing resolutions that are more similar to the ground-truth resolutions at the character, lexical, and semantic levels.
📝 Abstract
Code merging is a significant challenge, particularly in large-scale projects. Existing solutions, including program analysis and machine learning, show promise but face critical limitations. Program analysis cannot infer developers' intentions, so it relies on conservative strategies that leave unresolved conflicts for manual handling. Meanwhile, model-based approaches struggle with conflicts involving complex code dependencies due to insufficient contextual awareness. To address these gaps, we introduce Rover, a novel conflict resolution system that integrates program analysis with large language models (LLMs). To obtain context-aware prompts, we propose the Multi-layer Code Property Graph (MtCPG), a new representation that captures inter-file dependencies and enables contextual analysis for a given conflict. Using graph connectivity algorithms, Rover further clusters conflicting code and associated changes into meaningful "contexts" that guide the LLM in generating accurate resolutions. We compared Rover with standalone LLMs, the machine learning baseline MergeGen, and the suggestion-provider tool WizardMerge, with adjacent code as the contexts. Evaluation results show that Rover surpasses all of these approaches in conflict resolution, achieving higher similarity to ground-truth resolutions at the character, lexical, and semantic levels.
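The clustering step can be illustrated with a minimal sketch. The abstract does not specify how the MtCPG is stored or which connectivity algorithm Rover uses, so the example below makes loud assumptions: the graph is a plain list of undirected dependency edges between string node ids, and a "context" is the connected component reachable from a conflicting node. The function name `cluster_contexts` and the node ids are hypothetical and are not taken from Rover.

```python
from collections import defaultdict, deque


def cluster_contexts(edges, conflicts):
    """Group each conflicting node with everything connected to it.

    ASSUMPTION: edges are (a, b) pairs treated as undirected dependency
    links; this is a stand-in for the MtCPG, not its actual structure.
    Returns one sorted node list per connected component that contains
    at least one conflicting node.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(conflicts):
        if start in seen:
            continue  # already absorbed into an earlier cluster
        # Breadth-first search collects every node reachable from `start`.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(component))
    return clusters


# Hypothetical node ids of the form "file:symbol".
edges = [
    ("a.py:foo", "b.py:bar"),
    ("b.py:bar", "c.py:baz"),
    ("d.py:qux", "e.py:quux"),
]
conflicts = ["a.py:foo", "c.py:baz", "d.py:qux", "f.py:lonely"]
contexts = cluster_contexts(edges, conflicts)
```

In this sketch, two conflicts that share a dependency chain (`a.py:foo` and `c.py:baz`) land in the same context, while a conflict with no recorded dependencies forms a context of its own. Each resulting cluster could then be serialized into a single prompt, which is the role the abstract assigns to the "contexts".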