AI Summary
Existing automated code review methods often produce off-topic or overly generic comments. To address this, we propose RARe, the first framework to integrate Retrieval-Augmented Generation (RAG) into code review: a dense retriever identifies semantically relevant historical review cases from a repository, and a large language model uses in-context learning to generate issue-focused feedback. By combining retrieval accuracy with generative flexibility, RARe improves comment relevance and explainability. On two benchmark datasets, RARe achieves BLEU-4 scores of 12.32 and 12.96, outperforming current state-of-the-art methods. Human evaluation and an interpretability analysis further support its effectiveness and practical utility.
Abstract
Code review is essential for maintaining software quality but is labor-intensive. Automated code review generation offers a promising way to reduce this effort. Both deep learning-based generative techniques and retrieval-based methods have demonstrated strong performance on this task. However, the reviews they produce can still be off-point or overly general. To address these issues, we introduce Retrieval-Augmented Reviewer (RARe), which uses Retrieval-Augmented Generation (RAG) to combine retrieval-based and generative methods, explicitly incorporating external domain knowledge into the code review process. RARe uses a dense retriever to select the most relevant reviews from the codebase; these reviews enrich the input to a neural generator, which draws on the in-context learning capacity of large language models (LLMs) to produce the final review. RARe outperforms state-of-the-art methods on two benchmark datasets, achieving BLEU-4 scores of 12.32 and 12.96, respectively. Its effectiveness is further validated through a detailed human evaluation and a case study using an interpretability tool, demonstrating its practical utility and reliability.
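The retrieve-then-generate flow described above can be illustrated with a minimal sketch. This is not the RARe implementation: the hashed bag-of-tokens `embed` function is a toy stand-in for a trained dense retriever, the three-entry corpus and the prompt template are hypothetical, and the call to an LLM is omitted entirely. It only shows how retrieved (code change, review) pairs might be ranked by similarity and placed into an in-context prompt.

```python
import hashlib
import math
import re

DIM = 256  # size of the toy embedding space (arbitrary choice)


def embed(text):
    """Toy stand-in for a learned dense encoder: hashed token counts, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[A-Za-z_]+|\d+|\S", text.lower()):
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two already-normalized vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_diff, corpus, k=2):
    """Return the k (diff, review) pairs whose diffs are most similar to the query."""
    q = embed(query_diff)
    ranked = sorted(corpus, key=lambda pair: cosine(q, embed(pair[0])), reverse=True)
    return ranked[:k]


def build_prompt(query_diff, examples):
    """Assemble a few-shot prompt: retrieved pairs first, then the new change."""
    parts = ["Write a concise, specific review comment for the final code change."]
    for diff, review in examples:
        parts.append(f"Code change:\n{diff}\nReview: {review}")
    parts.append(f"Code change:\n{query_diff}\nReview:")
    return "\n\n".join(parts)


# Hypothetical historical reviews; a real system would index a full repository.
corpus = [
    ("- for i in range(len(items)):\n+ for i, item in enumerate(items):",
     "Good use of enumerate; drop the index if it is no longer needed."),
    ("+ conn = open_connection(url)\n+ data = conn.read()",
     "The connection is never closed; consider a context manager."),
    ("- timeout = 30\n+ timeout = 300",
     "Why the tenfold timeout increase? Please make this configurable."),
]

query = "+ f = open_connection(path)\n+ payload = f.read()"
top = retrieve(query, corpus, k=2)
prompt = build_prompt(query, top)  # this string would be sent to an LLM
print(prompt)
```

In this sketch the retrieved reviews act purely as in-context examples. Replacing `embed` with a trained encoder and sending `prompt` to a generator model would be the natural next steps, though the specific models and prompt format used by RARe are not stated in this section.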