RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring

📅 2025-11-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional LLMs rely on static instructions for software refactoring, limiting their ability to adapt to context and make autonomous decisions. This paper introduces RefAgent, an end-to-end, multi-agent LLM framework for software refactoring built around a four-phase collaborative mechanism (planning, execution, testing, and introspective optimization) that enables context-aware refactoring decisions and iterative quality improvement. Its key contributions are: (1) a specialized multi-agent architecture with role-based division of labor; (2) integrated self-reflection and tool-calling capabilities; and (3) a closed-loop, verification-driven paradigm for quality enhancement. Evaluated on eight open-source Java projects, RefAgent achieves a median unit test pass rate of 90%, reduces code smells by a median of 52.5%, improves key quality attributes (e.g., reusability) by a median of 8.6%, and attains median F1-scores of 79.15% and 72.7% for identifying refactoring opportunities relative to developer refactorings and a search-based tool, respectively. Compared to a single-agent baseline, it improves the median unit test pass rate by 64.7% and the median compilation success rate by 40.1%.

📝 Abstract
Large Language Models (LLMs) have substantially influenced various software engineering tasks. Indeed, in the case of software refactoring, traditional LLMs have shown the ability to reduce development time and enhance code quality. However, these LLMs often rely on static, detailed instructions for specific tasks. In contrast, LLM-based agents can dynamically adapt to evolving contexts and autonomously make decisions by interacting with software tools and executing workflows. In this paper, we explore the potential of LLM-based agents in supporting refactoring activities. Specifically, we introduce RefAgent, a multi-agent LLM-based framework for end-to-end software refactoring. RefAgent consists of specialized agents responsible for planning, executing, testing, and iteratively refining refactorings using self-reflection and tool-calling capabilities. We evaluate RefAgent on eight open-source Java projects, comparing its effectiveness against a single-agent approach, a search-based refactoring tool, and historical developer refactorings. Our assessment focuses on: (1) the impact of generated refactorings on software quality, (2) the ability to identify refactoring opportunities, and (3) the contribution of each LLM agent through an ablation study. Our results show that RefAgent achieves a median unit test pass rate of 90%, reduces code smells by a median of 52.5%, and improves key quality attributes (e.g., reusability) by a median of 8.6%. Additionally, it closely aligns with developer refactorings and the search-based tool in identifying refactoring opportunities, attaining a median F1-score of 79.15% and 72.7%, respectively. Compared to single-agent approaches, RefAgent improves the median unit test pass rate by 64.7% and the median compilation success rate by 40.1%. These findings highlight the promise of multi-agent architectures in advancing automated software refactoring.
Problem

Research questions and friction points this paper is trying to address.

Developing automated software refactoring using multi-agent LLM frameworks
Improving code quality by reducing smells and enhancing key attributes
Enhancing refactoring accuracy through specialized planning and testing agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LLM framework for automated software refactoring
Specialized agents handle planning, execution, testing, and iterative refinement
Agents use self-reflection and tool-calling capabilities
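The plan/execute/test/reflect loop described above can be sketched as a simple orchestration loop. Everything below is an illustrative assumption, not RefAgent's actual implementation: the agent roles, the `llm` stand-in, and the toy `run_tests` verifier are placeholders for real LLM calls and a real compile-and-test harness.

```python
def llm(role: str, prompt: str) -> str:
    """Stand-in for an LLM call; a real system would query a model per agent role."""
    return f"[{role}] response to: {prompt[:40]}"


def run_tests(code: str) -> bool:
    """Toy verification gate; a real testing agent would compile and run unit tests."""
    return code.startswith("[executor]")


def refactor(source: str, max_iters: int = 3) -> str:
    """Closed-loop refactoring: plan, execute, verify, then reflect on failures."""
    plan = llm("planner", f"Propose refactorings for:\n{source}")
    code = source
    for _ in range(max_iters):
        # Execution agent applies the current plan to the code.
        code = llm("executor", f"Apply plan:\n{plan}\nto code:\n{code}")
        if run_tests(code):  # testing agent acts as the verification gate
            return code
        # Introspective optimization: reflect on the failure and revise the plan.
        plan = llm("reflector", f"Tests failed for:\n{code}\nRevise plan:\n{plan}")
    return source  # fall back to the original code if no iteration passes
```

The key design point mirrored here is the closed loop: refactored code is only accepted once it passes verification, and failures feed back into the next plan rather than terminating the process.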