🤖 AI Summary
Existing automated code review models suffer from the quality heterogeneity of open-source review data, where variations in reviewer expertise lead to inconsistent feedback quality.
Method: We propose an experience-aware training framework that, for the first time, models reviewer project ownership as dynamic weights and introduces an Experience-aware Loss Function (ELF) to explicitly incorporate software engineering principles—specifically, reviewer expertise—into AI models. A Transformer-based sequence generation model integrates statistical features from author and reviewer historical interactions to construct experience signals, optimized via weighted cross-entropy.
Contribution/Results: Our approach significantly outperforms state-of-the-art methods across multiple benchmarks. Generated review comments exhibit consistent improvements in accuracy, informativeness, and type diversity, empirically validating both the effectiveness and necessity of explicitly modeling reviewer expertise for high-quality comment generation.
📝 Abstract
Modern code review is a ubiquitous software quality assurance process aimed at identifying potential issues within newly written code. Despite its effectiveness, the process demands large amounts of effort from the human reviewers involved. To help alleviate this workload, researchers have trained deep learning models to imitate human reviewers in providing natural language code reviews. Formally, this task is known as code review comment generation. Prior work has demonstrated improvements in this task by leveraging machine learning techniques and neural models, such as transfer learning and the transformer architecture. However, the quality of the model-generated reviews remains sub-optimal due to the quality of the open-source code review data used in model training. This is in part because the data is obtained from open-source projects, where code reviews are conducted in a public forum and reviewers possess varying levels of software development experience, potentially affecting the quality of their feedback. To accommodate this variation, we propose a suite of experience-aware training methods that utilise the reviewers' past authoring and reviewing experiences as signals for review quality. Specifically, we propose experience-aware loss functions (ELF), which use the reviewers' authoring and reviewing ownership of a project as weights in the model's loss function. Through this method, experienced reviewers' code reviews yield larger influence over the model's behaviour. Compared to the state-of-the-art model, ELF was able to generate higher quality reviews in terms of accuracy, informativeness, and comment types generated. The key contribution of this work is the demonstration of how traditional software engineering concepts such as reviewer experience can be integrated into the design of AI-based automated code review models.
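The core idea of ELF can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `experience_weight` formula (averaging a reviewer's authoring and reviewing ownership shares) and both function names are assumptions for illustration; the paper describes its own variants of the ownership signal.

```python
import math

def experience_weight(authored, reviewed, total_commits, total_reviews):
    """Hypothetical ownership signal: the reviewer's share of the
    project's authoring and reviewing activity, averaged.
    (An assumption; the paper defines its own ownership measures.)"""
    a = authored / total_commits if total_commits else 0.0
    r = reviewed / total_reviews if total_reviews else 0.0
    return (a + r) / 2

def weighted_cross_entropy(token_probs, weight):
    """Per-example loss: mean token-level negative log-likelihood
    scaled by the reviewer's experience weight, so comments from
    experienced reviewers contribute more to the gradient."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return weight * nll

# Example: a reviewer who authored 10 of 100 commits and wrote 5 of
# 50 reviews gets ownership weight (0.10 + 0.10) / 2 = 0.10, which
# scales the cross-entropy of their review comment accordingly.
w = experience_weight(10, 5, 100, 50)
loss = weighted_cross_entropy([0.5, 0.25], w)
```

In a real training loop these weights would multiply the per-example loss inside a framework such as PyTorch before averaging over the batch.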