From Empirical Evaluation to Context-Aware Enhancement: Repairing Regression Errors with LLMs

📅 2025-06-16
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the effectiveness bottleneck of Automated Program Repair (APR) in fixing real-world regression bugs. Motivated by the widespread failure of existing APR tools on regression defects, we construct RegMiner4APR, the first high-quality, manually validated Java regression bug benchmark, comprising 99 confirmed defects. We propose a context-aware LLM-based repair method that incorporates bug-inducing changes into the prompt, using git diff analysis to extract change-sensitive contextual information. Empirical evaluation shows that conventional APR tools repair none of the bugs in RegMiner4APR, whereas our approach yields 1.8× more successful repairs than change-agnostic LLM-based repair. This study establishes the critical role of change-context awareness in regression repair and delivers a reproducible, scalable paradigm for LLM-driven APR.

πŸ“ Abstract
[...] Since then, various APR approaches, especially those leveraging the power of large language models (LLMs), have been rapidly developed to fix general software bugs. Unfortunately, the effectiveness of these advanced techniques in the context of regression bugs remains largely unexplored. This gap motivates the need for an empirical study evaluating the effectiveness of modern APR techniques in fixing real-world regression bugs. In this work, we conduct an empirical study of APR techniques on Java regression bugs. To facilitate our study, we introduce RegMiner4APR, a high-quality benchmark of Java regression bugs integrated into a framework designed to facilitate APR research. The current benchmark includes 99 regression bugs collected from 32 widely used real-world Java GitHub repositories. We begin by conducting an in-depth analysis of the benchmark, demonstrating its diversity and quality. Building on this foundation, we empirically evaluate the capabilities of APR to regression bugs by assessing both traditional APR tools and advanced LLM-based APR approaches. Our experimental results show that classical APR tools fail to repair any bugs, while LLM-based APR approaches exhibit promising potential. Motivated by these results, we investigate impact of incorporating bug-inducing change information into LLM-based APR approaches for fixing regression bugs. Our results highlight that this context-aware enhancement significantly improves the performance of LLM-based APR, yielding 1.8x more successful repairs compared to using LLM-based APR without such context.
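
To make the idea of "incorporating bug-inducing change information" concrete, the following is a minimal sketch of how a change-aware repair prompt might be assembled. It is not the paper's actual prompt template or tooling: the function names (`bug_inducing_diff`, `build_repair_prompt`), the prompt wording, and the toy `clamp` method are all hypothetical, and the diff is computed with Python's standard `difflib` rather than from a real git history.

```python
import difflib
from typing import Optional

# Toy example (hypothetical): the last working version and the regressed version.
WORKING = "int clamp(int v, int max) {\n    return v > max ? max : v;\n}\n"
REGRESSED = "int clamp(int v, int max) {\n    return v >= max ? v : max;\n}\n"
FAILING_TEST = "assertEquals(5, clamp(9, 5));"


def bug_inducing_diff(working_src: str, regressed_src: str,
                      path: str = "Example.java") -> str:
    """Return a unified diff from the working version to the regressed one."""
    lines = difflib.unified_diff(
        working_src.splitlines(keepends=True),
        regressed_src.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


def build_repair_prompt(buggy_method: str, failing_test: str,
                        diff: Optional[str] = None) -> str:
    """Assemble a repair prompt; the diff section is added only when provided.

    Without `diff` this is a change-agnostic prompt; with it, the prompt is
    change-aware. The wording is an assumption, not the paper's template.
    """
    parts = [
        "The following Java method contains a regression bug.",
        "Buggy method:\n" + buggy_method,
        "Failing test:\n" + failing_test,
    ]
    if diff:
        parts.append("Bug-inducing change (unified diff):\n" + diff)
    parts.append("Return the fixed method only.")
    return "\n\n".join(parts)


if __name__ == "__main__":
    diff = bug_inducing_diff(WORKING, REGRESSED)
    print(build_repair_prompt(REGRESSED, FAILING_TEST, diff))
```

Calling `build_repair_prompt` without the `diff` argument gives the change-agnostic variant, so the two settings compared in the abstract differ only in whether the diff section is present.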
Problem

Research questions and friction points this paper is trying to address.

Evaluating APR techniques for fixing Java regression bugs
Assessing LLM-based APR effectiveness on real-world regression bugs
Enhancing LLM-based APR with bug-inducing change information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs to fix regression bugs
Introducing RegMiner4APR benchmark
Enhancing LLMs with bug-inducing changes