From Empirical Evaluation to Context-Aware Enhancement: Repairing Regression Errors with LLMs

📅 2025-06-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work examines how well Automated Program Repair (APR) fixes real-world regression bugs. Motivated by the failure of existing APR tools on regression defects, we construct RegMiner4APR, the first high-quality, manually validated benchmark of Java regression bugs, comprising 99 confirmed defects. We propose a context-aware LLM-based repair method that adds the bug-inducing change to the prompt, using git diff analysis to extract the context affected by that change. In the empirical evaluation, traditional APR tools repair none of the bugs in RegMiner4APR, whereas the context-aware approach yields 1.8× more successful repairs than LLM-based APR without the change context. The study shows that awareness of the bug-inducing change matters for regression repair and provides a reproducible benchmark and framework for LLM-driven APR.

📝 Abstract

[...] Since then, various APR approaches, especially those leveraging the power of large language models (LLMs), have been rapidly developed to fix general software bugs. Unfortunately, the effectiveness of these advanced techniques in the context of regression bugs remains largely unexplored. This gap motivates the need for an empirical study evaluating the effectiveness of modern APR techniques in fixing real-world regression bugs. In this work, we conduct an empirical study of APR techniques on Java regression bugs. To facilitate our study, we introduce RegMiner4APR, a high-quality benchmark of Java regression bugs integrated into a framework designed to facilitate APR research. The current benchmark includes 99 regression bugs collected from 32 widely used real-world Java GitHub repositories. We begin by conducting an in-depth analysis of the benchmark, demonstrating its diversity and quality. Building on this foundation, we empirically evaluate the capabilities of APR on regression bugs by assessing both traditional APR tools and advanced LLM-based APR approaches. Our experimental results show that classical APR tools fail to repair any bugs, while LLM-based APR approaches exhibit promising potential. Motivated by these results, we investigate the impact of incorporating bug-inducing change information into LLM-based APR approaches for fixing regression bugs. Our results highlight that this context-aware enhancement significantly improves the performance of LLM-based APR, yielding 1.8x more successful repairs compared to using LLM-based APR without such context.

Problem

Research questions and friction points this paper is trying to address.

Evaluating APR techniques for fixing Java regression bugs

Assessing LLM-based APR effectiveness on real-world regression bugs

Enhancing LLM-based APR with bug-inducing change information

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs to fix regression bugs

Introducing RegMiner4APR benchmark

Enhancing LLMs with bug-inducing changes
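
The idea of enhancing an LLM with the bug-inducing change can be illustrated with a minimal sketch. It is not the paper's implementation: the prompt wording, the function names `bug_inducing_diff` and `build_repair_prompt`, and the `clamp` example are all hypothetical, and Python's standard `difflib` stands in for the diff a version-control system would produce. The sketch only shows the difference between a prompt without the change and a context-aware prompt that includes it.

```python
import difflib
from typing import Optional


def bug_inducing_diff(working_src: str, regressed_src: str) -> str:
    """Unified diff from the last working version to the regressed version.

    Assumption: both versions of the method are available as strings.
    """
    diff = difflib.unified_diff(
        working_src.splitlines(),
        regressed_src.splitlines(),
        fromfile="working",
        tofile="regressed",
        lineterm="",
    )
    return "\n".join(diff)


def build_repair_prompt(
    buggy_code: str,
    failing_test: str,
    inducing_diff: Optional[str] = None,
) -> str:
    """Assemble a repair prompt; the diff section is added only when given.

    The prompt text is illustrative, not taken from the paper.
    """
    parts = [
        "The following Java method fails a test that passed in an earlier version.",
        "Buggy method:\n" + buggy_code,
        "Failing test:\n" + failing_test,
    ]
    if inducing_diff:
        parts.append("Bug-inducing change (unified diff):\n" + inducing_diff)
    parts.append("Return a fixed version of the method.")
    return "\n\n".join(parts)


# Hypothetical regression: a later commit changed the boundary handling.
working = "int clamp(int x, int max) {\n    return x > max ? max : x;\n}"
regressed = "int clamp(int x, int max) {\n    return x >= max ? max - 1 : x;\n}"
test = "assertEquals(5, clamp(5, 5));"

baseline_prompt = build_repair_prompt(regressed, test)
context_prompt = build_repair_prompt(
    regressed, test, bug_inducing_diff(working, regressed)
)
print(context_prompt)
```

The two prompts differ only in the diff section, which makes it straightforward to compare repair outcomes with and without the change context on the same bug.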
