A Systematic Literature Review on Large Language Models for Automated Program Repair

📅 2024-05-02

🏛️ arXiv.org

📈 Citations: 39

✨ Influential: 1

🤖 AI Summary

Problem: Research on large language models (LLMs) for automated program repair (APR) is fragmented and lacks a unified overview. Method: We conduct a systematic literature review (SLR) of 127 papers published between 2020 and 2024 and use it to build a conceptual framework for LLM-based APR. We group model utilization strategies into three types (fine-tuning, prompt engineering, and hybrid ensemble) and analyze the literature along three further dimensions: input representation, specific repair scenarios such as semantic bugs and security vulnerabilities, and open-science practices. Contribution/Results: We identify open challenges including model robustness, evaluation bias, and adaptability to real-world deployment. The study yields a reusable taxonomy, benchmark insights, and methodological guidelines that help the APR community locate research gaps and plan future work.

📝 Abstract

Automated Program Repair (APR) attempts to patch software bugs and reduce manual debugging effort. Recently, with advances in Large Language Models (LLMs), a growing number of LLM-based APR techniques have been proposed, facilitating software development and maintenance and demonstrating remarkable performance. However, because the field is still being actively explored, it is difficult for researchers to understand its current achievements, challenges, and potential opportunities. This work provides the first systematic literature review summarizing the applications of LLMs in APR between 2020 and 2024. We analyze 127 relevant papers from three perspectives: LLMs, APR, and their integration. First, we categorize the popular LLMs that are applied to support APR and outline three types of utilization strategies for their deployment. Second, we detail specific repair scenarios that benefit from LLMs, e.g., semantic bugs and security vulnerabilities. Third, we discuss several critical aspects of integrating LLMs into APR research, e.g., input forms and open science. Finally, we highlight a set of challenges that remain to be investigated and potential guidelines for future research. Overall, our paper provides the APR community with a systematic overview of the research landscape, helping researchers gain a comprehensive understanding of existing achievements and promoting future research.

Problem

Research questions and friction points this paper is trying to address.

Summarizing LLM applications in automated program repair research

Analyzing current achievements and challenges in LLM-based APR

Providing a systematic review of LLM utilization strategies for APR

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models automate software bug patching

Systematic review categorizes three LLM utilization strategies

LLMs address semantic bugs and security vulnerabilities

🔎 Similar Papers

An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications