🤖 AI Summary
This work proposes an AI-agent-driven, end-to-end code review workflow. It targets two problems: the growing volume of code produced with AI coding assistants, and the inefficiency and high cognitive load of traditional manual review. The framework combines large language models and specialized agents across five stages: pull request (PR) creation, PR augmentation, reviewer selection, AI-assisted code review, and PR retrospective. Agents handle automated processing, recommendations, and comment generation, while human-controlled quality gates at key decision points preserve judgment, accountability, and team-level understanding. The study positions this workflow as a response to the fragmentation of existing tools, which address isolated tasks rather than the whole review process. It also identifies six open challenges (reliability, bias, privacy, automation bias, transparency, and evaluation) and offers a research agenda for human-AI collaboration in software engineering.
📝 Abstract
Code review has evolved for decades, from informal peer checking to today's pull request (PR) workflows, yet it remains a largely manual, uneven, and cognitively demanding process. The rise of Artificial Intelligence (AI) coding assistants has intensified this challenge: while these tools increase code production velocity, they also expand the volume of code requiring review, turning code review into a growing bottleneck. Current AI support remains fragmented, with tools focusing on isolated tasks such as reviewer recommendation, PR description generation, or comment suggestion rather than the end-to-end PR review workflow. In this paper, we review the historical evolution of code review practices and examine the shift driven by large language models (LLMs) and agentic AI systems. We then present a vision for an AI-powered code review workflow combining specialized agents with human-controlled quality gates. Our framework spans five stages: PR Creation, PR Augmentation, Reviewer Selection, AI-Assisted Code Review, and PR Retrospective, with humans retained at key decision points to preserve judgment, accountability, and team-level understanding. We identify major open challenges for responsible adoption, including reliability, bias, privacy, automation bias, transparency, and evaluation, and offer a research agenda for more effective human-AI collaboration in software engineering.
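The five-stage structure with human-controlled gates can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the stage names come from the abstract, while `PullRequest`, `run_workflow`, the `agent` and `human_gate` callables, and the choice of which stages are gated are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Stage names are taken from the abstract; everything else is a hypothetical sketch.
STAGES = [
    "PR Creation",
    "PR Augmentation",
    "Reviewer Selection",
    "AI-Assisted Code Review",
    "PR Retrospective",
]


@dataclass
class PullRequest:
    title: str
    log: List[str] = field(default_factory=list)


def run_workflow(
    pr: PullRequest,
    agent: Callable[[str, PullRequest], str],
    human_gate: Callable[[str, str], bool],
    gated: Set[str],
) -> bool:
    """Run each stage in order; stop as soon as a human rejects a gated stage."""
    for stage in STAGES:
        # A stand-in for a specialized agent producing a proposal for this stage.
        proposal = agent(stage, pr)
        # Assumption: only stages listed in `gated` require human sign-off.
        if stage in gated and not human_gate(stage, proposal):
            pr.log.append(f"{stage}: rejected by human")
            return False
        pr.log.append(f"{stage}: {proposal}")
    return True


if __name__ == "__main__":
    pr = PullRequest("Add retry logic")
    finished = run_workflow(
        pr,
        agent=lambda stage, _pr: f"draft output for {stage}",
        human_gate=lambda stage, proposal: True,  # approve everything in this demo
        gated={"Reviewer Selection", "AI-Assisted Code Review"},
    )
    print(finished)
    print("\n".join(pr.log))
```

Keeping the gate as a separate callable, rather than folding approval into the agent, mirrors the abstract's point that humans stay at the decision points; which stages belong in `gated` is left open here because the abstract does not specify them.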