On the Feasibility of Deduplicating Compiler Bugs with Bisection

📅 2025-06-29
🤖 AI Summary
Duplicate bug reports in compiler testing cause redundant debugging work, so efficient deduplication is needed. This paper studies whether bisection, a standard debugging procedure largely overlooked in prior deduplication research, can serve that purpose. It finds that the failure-inducing commit located by bisection is a valuable deduplication criterion, but one that needs supplementary techniques to be accurate. The resulting method, BugLens, relies primarily on bisection and additionally identifies bug-triggering optimizations to minimize false negatives. On four real-world datasets, BugLens saves an average of 26.98% and 9.64% of human effort relative to Tamer and D3, respectively, when identifying the same number of distinct bugs. The authors argue that the simplicity and generalizability of bisection make it a practical choice for real-world compiler bug deduplication.

📝 Abstract
Random testing has proven to be an effective technique for compiler validation. However, the debugging of bugs identified through random testing presents a significant challenge due to the frequent occurrence of duplicate test programs that expose identical compiler bugs. The process to identify duplicates is a practical research problem known as bug deduplication. Prior methodologies for compiler bug deduplication primarily rely on program analysis to extract bug-related features for duplicate identification, which can result in substantial computational overhead and limited generalizability. This paper investigates the feasibility of employing bisection, a standard debugging procedure largely overlooked in prior research on compiler bug deduplication, for this purpose. Our study demonstrates that the utilization of bisection to locate failure-inducing commits provides a valuable criterion for deduplication, albeit one that requires supplementary techniques for more accurate identification. Building on these results, we introduce BugLens, a novel deduplication method that primarily uses bisection, enhanced by the identification of bug-triggering optimizations to minimize false negatives. Empirical evaluations conducted on four real-world datasets demonstrate that BugLens significantly outperforms the state-of-the-art analysis-based methodologies Tamer and D3 by saving an average of 26.98% and 9.64% human effort to identify the same number of distinct bugs. Given the inherent simplicity and generalizability of bisection, it presents a highly practical solution for compiler bug deduplication in real-world applications.
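
The core idea in the abstract, using the failure-inducing commit as a deduplication criterion, can be sketched as follows. This is a minimal illustration rather than the BugLens implementation: the function names, the `fails_at` callback, and the assumption that a failure persists in every commit after the one that introduced it are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def first_failing_commit(commits: Sequence[str],
                         fails: Callable[[str], bool]) -> str:
    """Binary-search for the earliest commit at which `fails` returns True.

    Assumptions (not from the paper): `commits` is ordered oldest to newest,
    the oldest commit passes, the newest fails, and the failure is monotonic.
    """
    lo, hi = 0, len(commits) - 1
    if fails(commits[lo]) or not fails(commits[hi]):
        raise ValueError("need a passing oldest commit and a failing newest commit")
    # Invariant: commits[lo] passes, commits[hi] fails.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


def group_by_commit(tests: Sequence[str],
                    commits: Sequence[str],
                    fails_at: Callable[[str, str], bool]) -> Dict[str, List[str]]:
    """Group test programs by the failure-inducing commit found via bisection."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for test in tests:
        commit = first_failing_commit(
            commits, lambda c, t=test: fails_at(t, c))
        groups[commit].append(test)
    return dict(groups)
```

Test programs that land in the same group are candidate duplicates. As the abstract notes, this criterion alone is not fully accurate, which is why BugLens supplements it with further information.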

Problem

Research questions and friction points this paper is trying to address.

Identifying duplicate compiler bugs efficiently
Reducing computational overhead in bug deduplication
Improving accuracy in compiler bug identification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses bisection for compiler bug deduplication
Enhances bisection with bug-triggering optimizations
Outperforms analysis-based methods in human effort
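
One way to picture how bug-triggering optimizations could refine a commit-based grouping is sketched below. It is an assumption-laden illustration, not the procedure from the paper: the one-flag-at-a-time search, the `fails_with` callback, and the placeholder flag names are all hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def triggering_flags(flags: Sequence[str],
                     fails_with: Callable[[List[str]], bool]) -> FrozenSet[str]:
    """Return the flags whose individual removal makes the failure disappear.

    `fails_with(enabled)` is a hypothetical callback that compiles the test
    program with only `enabled` optimizations and reports whether it fails.
    """
    return frozenset(
        f for f in flags
        if not fails_with([g for g in flags if g != f])
    )


def dedup_key(commit: str,
              flags: Sequence[str],
              fails_with: Callable[[List[str]], bool]) -> Tuple[str, FrozenSet[str]]:
    """Combine the failure-inducing commit with the triggering optimizations."""
    return (commit, triggering_flags(flags, fails_with))
```

Under this sketch, two test programs are treated as duplicates only when they share both the failure-inducing commit and the same set of triggering optimizations.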