Atomizer: An LLM-based Collaborative Multi-Agent Framework for Intent-Driven Commit Untangling

📅 2026-01-03
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge posed by entangled, unrelated changes in composite code commits, which significantly hinder code comprehension and maintenance. Existing approaches struggle to accurately infer semantic intent and lack mechanisms for iterative refinement. To overcome these limitations, we propose a large language model–based multi-agent collaborative framework that integrates structural and semantic information through an Intent-Oriented Chain-of-Thought (IO-CoT) strategy to disentangle and infer modification intents. The framework further incorporates a grouper and a reviewer to form a human-like collaborative feedback loop, enabling iterative optimization of change grouping. Evaluated on C# and Java datasets, our method outperforms the current state-of-the-art graph clustering approaches by 6.0% and 5.5% on average, respectively, with performance gains exceeding 16% in complex commit scenarios.

πŸ“ Abstract
Composite commits, which entangle multiple unrelated concerns, are prevalent in software development and significantly hinder program comprehension and maintenance. Existing automated untangling methods, particularly state-of-the-art graph clustering-based approaches, are fundamentally limited by two issues: (1) they over-rely on structural information, failing to grasp the crucial semantic intent behind changes, and (2) they operate as "single-pass" algorithms, lacking a mechanism for the critical reflection and refinement inherent in human review processes. To overcome these challenges, we introduce Atomizer, a novel collaborative multi-agent framework for composite commit untangling. To address the semantic deficit, Atomizer employs an Intent-Oriented Chain-of-Thought (IO-CoT) strategy, which prompts large language models (LLMs) to infer the intent of each code change according to both the structural and the semantic information of the code. To overcome the limitations of "single-pass" grouping, we employ two agents to establish a grouper-reviewer collaborative refinement loop, which mirrors human review practices by iteratively refining groupings until all changes in a cluster share the same underlying semantic intent. Extensive experiments on two benchmark C# and Java datasets demonstrate that Atomizer significantly outperforms several representative baselines. On average, it surpasses the state-of-the-art graph-based methods by over 6.0% on the C# dataset and 5.5% on the Java dataset. This superiority is particularly pronounced on complex commits, where Atomizer's performance advantage widens to over 16%.
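
The grouper-reviewer loop described in the abstract can be pictured with a small Python sketch. This is not the authors' implementation: the names `Change`, `untangle`, `keyword_intent`, and `keyword_reviewer` are hypothetical, and the two LLM agents (including the IO-CoT prompting) are replaced by simple keyword rules so the control flow can run without a model. It only illustrates the idea of grouping changes by inferred intent and regrouping until a reviewer raises no further objections.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Change:
    """One code change in a composite commit (hypothetical representation)."""
    change_id: str
    diff: str


Groups = Dict[str, List[Change]]
# Assumed agent interfaces; in the paper these roles are played by LLM agents.
IntentFn = Callable[[Change], str]            # change -> inferred intent label
ReviewFn = Callable[[Groups], Dict[str, str]]  # groups -> {change_id: corrected intent}


def group_by_intent(changes: List[Change], intents: Dict[str, str]) -> Groups:
    """Grouper step: cluster changes that share the same intent label."""
    groups: Groups = {}
    for change in changes:
        groups.setdefault(intents[change.change_id], []).append(change)
    return groups


def untangle(changes: List[Change], infer_intent: IntentFn,
             review: ReviewFn, max_rounds: int = 3) -> Groups:
    """Regroup until the reviewer has no corrections or max_rounds is reached."""
    intents = {c.change_id: infer_intent(c) for c in changes}
    groups = group_by_intent(changes, intents)
    for _ in range(max_rounds):
        corrections = review(groups)
        if not corrections:  # reviewer accepts every group
            break
        intents.update(corrections)
        groups = group_by_intent(changes, intents)
    return groups


def keyword_intent(change: Change) -> str:
    """Stand-in for LLM intent inference: naive keyword matching."""
    if "rename" in change.diff:
        return "refactor"
    if "fix" in change.diff:
        return "bugfix"
    return "other"


def keyword_reviewer(groups: Groups) -> Dict[str, str]:
    """Stand-in for the reviewer agent: flags fixes hidden in the refactor group."""
    corrections: Dict[str, str] = {}
    for change in groups.get("refactor", []):
        if "fix" in change.diff:
            corrections[change.change_id] = "bugfix"
    return corrections


demo_changes = [
    Change("c1", "fix null check in Parser"),
    Change("c2", "rename helper method"),
    Change("c3", "rename variable and fix off-by-one"),
]

if __name__ == "__main__":
    result = untangle(demo_changes, keyword_intent, keyword_reviewer)
    for intent, members in result.items():
        print(intent, [c.change_id for c in members])
```

In this toy run the first pass places `c3` with the refactoring changes, the reviewer objects, and the second pass moves it next to `c1`. The `max_rounds` cap is an assumption added here to guarantee termination; the abstract only states that refinement continues until all changes in a cluster share the same intent.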
Problem

Research questions and friction points this paper is trying to address.

composite commits
commit untangling
semantic intent
program comprehension
software maintenance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Intent-Oriented Chain-of-Thought
Multi-Agent Framework
Commit Untangling
Large Language Models
Collaborative Refinement
Authors

Kangchen Zhu
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Zhiliang Tian
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Shangwen Wang
National University of Defense Technology
software engineering
Mingyue Leng
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Xiaoguang Mao
College of Computer Science and Technology, National University of Defense Technology, Changsha, China