๐ค AI Summary
Poor intent understanding in code review hinders effective automated code correction. To address this, we propose an intent-driven two-stage code refinement method. First, we introduce a novel explicit intent modeling framework, designing a three-level, eight-category comment intent classification scheme and integrating rule-based and large language model (LLM)-based approaches for robust intent recognition (79% accuracy). Second, leveraging structured intents as constraints, we guide multiple LLMsโincluding GPT-4o and DeepSeek-V2โto generate high-quality revised code. Through templated intent extraction, a hybrid classifier, and multi-round prompt optimization, our method significantly improves the relevance and executability of generated code, achieving up to 66% correction accuracy. Extensive experiments demonstrate strong generalizability across diverse LLMs and prompting strategies, substantially enhancing code refinement quality and developer collaboration efficiency.
๐ Abstract
Code refinement aims to enhance existing code by addressing issues, refactoring, and optimizing to improve quality and meet specific requirements. As software projects scale in size and complexity, the traditional iterative exchange between reviewers and developers becomes increasingly burdensome. While recent deep learning techniques have been explored to accelerate this process, their performance remains limited, primarily due to challenges in accurately understanding reviewers' intents. This paper proposes an intention-based code refinement technique that enhances the conventional comment-to-code process by explicitly extracting reviewer intentions from the comments. Our approach consists of two key phases: Intention Extraction and Intention Guided Revision Generation. Intention Extraction categorizes comments using predefined templates, while Intention Guided Revision Generation employs large language models (LLMs) to generate revised code based on these defined intentions. Three categories with eight subcategories are designed for comment transformation, which is followed by a hybrid approach that combines rule-based and LLM-based classifiers for accurate classification. Extensive experiments with five LLMs (GPT4o, GPT3.5, DeepSeekV2, DeepSeek7B, CodeQwen7B) under different prompting settings demonstrate that our approach achieves 79% accuracy in intention extraction and up to 66% in code refinement generation. Our results highlight the potential of our approach in enhancing data quality and improving the efficiency of code refinement.