🤖 AI Summary
This study addresses the lack of systematic investigation into how AI-powered software development agents refactor Java code in practice and how those refactorings affect code quality. Presented as the first empirical analysis of its kind, the work compares refactoring pull requests submitted by AI agents with refactorings by human developers, across 86 open-source projects per group. Using RefactoringMiner and DesigniteJava 3.0, it identifies refactoring types and detects code smells before and after refactoring commits, then compares the two groups quantitatively with statistical tests. The findings show that agent refactorings are dominated by annotation-related changes (the five most common agent refactoring types are all annotation related), whereas human developers make more diverse structural improvements. Notably, Cursor is the only agent whose refactorings show a statistically significant increase in code smells. These results highlight current limitations of AI-driven refactoring and provide empirical grounding for the design of AI-assisted refactoring tools.
📝 Abstract
Software development agents such as Claude Code, GitHub Copilot, Cursor Agent, Devin, and OpenAI Codex are increasingly integrated into developer workflows. While prior work has evaluated agent capabilities for code completion and task automation, little work has investigated how these agents perform Java refactoring in practice, the types of changes they make, and their impact on code quality. In this study, we present the first analysis of agentic refactoring pull requests in Java, comparing them to developer refactorings across 86 projects per group. Using RefactoringMiner and DesigniteJava 3.0, we identify refactoring types and detect code smells before and after refactoring commits. Our results show that agent refactorings are dominated by annotation changes (the five most common refactoring types performed by agents are all annotation related), in contrast to the diverse structural improvements typical of developers. Despite these differences in refactoring types, we find Cursor to be the only agent to show a statistically significant increase in code smells after refactoring.