AI Summary
Fuzzing-generated crash-inducing inputs often exhibit high semantic redundancy, hindering root-cause defect localization. To address this, we propose a pre-debugging input refinement method that optimizes for *minimizing the semantic distance between crash-triggering and passing inputs*. Our approach employs differential analysis-driven, constraint-preserving edit sequences to iteratively simplify crash inputs while preserving their crash behavior. It tightly integrates gray-box fuzzing feedback with a spectrum-based fault localization framework, enabling focused identification of crash conditions and precise localization of defective code. Evaluated on the Magma benchmark, our method significantly reduces the semantic distance between crash and passing inputs, improving spectrum-based fault localization accuracy by 27.3% on average. To our knowledge, this is the first work to formalize minimal semantic difference refinement as a core pre-debugging paradigm.
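To illustrate why a crashing input that resembles passing inputs can help spectrum-based fault localization, the sketch below scores program elements with the Ochiai formula. The source does not say which suspiciousness metric is used, so Ochiai is an assumption here, and the function name `ochiai` and the coverage-set representation are hypothetical.

```python
from math import sqrt
from typing import Dict, Hashable, Iterable, Set


def ochiai(
    failing: Iterable[Set[Hashable]],
    passing: Iterable[Set[Hashable]],
) -> Dict[Hashable, float]:
    """Return an Ochiai suspiciousness score per covered element.

    Each item in `failing` / `passing` is the set of elements (for
    example, line identifiers) covered by one crashing / passing run.
    Ochiai is an assumed metric, not one named by the source.
    """
    failing = list(failing)
    passing = list(passing)
    total_failed = len(failing)
    scores: Dict[Hashable, float] = {}
    for element in set().union(*failing, *passing):
        ef = sum(element in cov for cov in failing)  # failing runs covering it
        ep = sum(element in cov for cov in passing)  # passing runs covering it
        denom = sqrt(total_failed * (ef + ep))
        scores[element] = ef / denom if denom else 0.0
    return scores
```

Under this metric, an element covered by both the crashing and the passing run scores lower than one covered only by the crashing run. The closer the two executions are, the fewer elements remain in the top-ranked group.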
Abstract
We present DiffMin, a technique that refines a fuzzed crashing input so that it more closely resembles given passing inputs, helping developers analyze the crash, identify the failure-inducing condition, and locate the buggy code. DiffMin iteratively applies edit actions to transform a fuzzed input while preserving the crash behavior. Our pilot study on the Magma benchmark demonstrates that DiffMin effectively minimizes the differences between crashing and passing inputs while enhancing the accuracy of spectrum-based fault localization, highlighting its potential as a valuable pre-debugging step after greybox fuzzing.
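The iterative refinement described above can be sketched as a greedy loop. This is not DiffMin's actual algorithm: it assumes byte-level edit actions derived from Python's `difflib`, a single passing input, and a caller-supplied crash oracle. The names `refine`, `still_crashes`, and `max_rounds` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable


def refine(
    crash: bytes,
    passing: bytes,
    still_crashes: Callable[[bytes], bool],
    max_rounds: int = 10,
) -> bytes:
    """Greedily move `crash` toward `passing` while it keeps crashing.

    Assumed edit action: replace one differing region of the crashing
    input with the aligned region of the passing input.
    """
    current = crash
    for _ in range(max_rounds):
        accepted = False
        matcher = SequenceMatcher(None, current, passing, autojunk=False)
        # Walk the diff right to left so that offsets of regions further
        # left stay valid after an edit changes the input length.
        for tag, i1, i2, j1, j2 in reversed(matcher.get_opcodes()):
            if tag == "equal":
                continue
            candidate = current[:i1] + passing[j1:j2] + current[i2:]
            if still_crashes(candidate):  # keep only crash-preserving edits
                current = candidate
                accepted = True
        if not accepted:
            break  # no edit preserved the crash; a fixed point was reached
    return current
```

In practice `still_crashes` would run the target program, ideally checking that the same crash site is reached rather than any crash. A call might look like `refine(crash_bytes, seed_bytes, oracle)`, where `oracle` wraps the instrumented binary.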