Refining Fuzzed Crashing Inputs for Better Fault Diagnosis

📅 2025-05-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Fuzzing-generated crash-inducing inputs often exhibit high semantic redundancy, hindering root-cause defect localization. To address this, we propose a pre-debugging input refinement method that optimizes for *minimizing the semantic distance between crash-triggering and passing inputs*. Our approach employs differential analysis–driven, constraint-preserving edit sequences to iteratively simplify crash inputs while preserving their crash behavior. It tightly integrates gray-box fuzzing feedback with a spectrum-based fault localization framework, enabling focused identification of crash conditions and precise localization of defective code. Evaluated on the Magma benchmark, our method significantly reduces the semantic distance between crash and passing inputs, improving spectrum-based fault localization accuracy by 27.3% on average. To our knowledge, this is the first work to formalize minimal semantic difference refinement as a core pre-debugging paradigm.

📝 Abstract

We present DiffMin, a technique that refines a fuzzed crashing input so that it more closely resembles given passing inputs. The refined input helps developers analyze the crash, identify the failure-inducing condition, and locate the buggy code. DiffMin iteratively applies edit actions to transform the fuzzed input while preserving its crash behavior. Our pilot study on the Magma benchmark shows that DiffMin reduces the differences between crashing and passing inputs and improves the accuracy of spectrum-based fault localization, highlighting its potential as a pre-debugging step after greybox fuzzing.
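The idea of applying edit actions while preserving the crash can be illustrated with a small greedy loop. This is a minimal sketch under stated assumptions, not DiffMin's actual algorithm: the function `refine`, the oracle `still_crashes`, and the choice of `difflib` regions as edit actions are all hypothetical, and a real setup would run the target program instead of a toy predicate.

```python
from difflib import SequenceMatcher
from typing import Callable


def refine(crashing: bytes, passing: bytes,
           still_crashes: Callable[[bytes], bool],
           max_rounds: int = 1000) -> bytes:
    """Greedily pull `crashing` toward `passing` while the crash persists.

    Hypothetical illustration: `still_crashes` stands in for re-running
    the program under test on a candidate input.
    """
    current = crashing
    for _ in range(max_rounds):
        # Regions where the current input differs from the passing input.
        matcher = SequenceMatcher(None, current, passing, autojunk=False)
        accepted = False
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                continue
            # Edit action: overwrite one differing region with the
            # corresponding bytes of the passing input.
            candidate = current[:i1] + passing[j1:j2] + current[i2:]
            if still_crashes(candidate):
                current = candidate
                accepted = True
                break  # offsets changed, so recompute the diff
        if not accepted:
            break  # no single edit keeps the crash; stop refining
    return current


# Toy oracle (assumption): the "program" crashes whenever it sees "%n".
def toy_oracle(data: bytes) -> bool:
    return b"%n" in data


refined = refine(b"GET /a?x=%n HTTP/9.9", b"GET /a?x=1 HTTP/1.1", toy_oracle)
```

In this toy run, edits that would remove the `%n` token are rejected because the crash disappears, while incidental differences such as the version digits can be aligned with the passing input. What remains different is a candidate for the failure-inducing condition.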

Problem

Research questions and friction points this paper is trying to address.

Refine fuzzed crashing inputs for better fault diagnosis

Minimize differences between crashing and passing inputs

Enhance accuracy of spectrum-based fault localization
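To see why a crashing input that resembles a passing input can help spectrum-based fault localization, consider how a suspiciousness score is computed from coverage. The sketch below uses the Ochiai formula, one common choice; the source does not say which formula the paper uses, so this is an assumption, and the function name and the element names are hypothetical.

```python
import math
from typing import Dict, List, Set


def ochiai(coverage: List[Set[str]], failed: List[bool]) -> Dict[str, float]:
    """Score each covered element by ef / sqrt(total_failed * (ef + ep)).

    ef = failing runs that execute the element,
    ep = passing runs that execute the element.
    """
    total_failed = sum(failed)
    elements = set().union(*coverage)
    scores: Dict[str, float] = {}
    for element in elements:
        ef = sum(1 for cov, f in zip(coverage, failed) if f and element in cov)
        ep = sum(1 for cov, f in zip(coverage, failed) if not f and element in cov)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[element] = ef / denom if denom else 0.0
    return scores


# One crashing run and one similar passing run (made-up coverage data).
runs = [{"parse", "check", "copy"}, {"parse", "check"}]
scores = ochiai(runs, failed=[True, False])
```

Here the two runs share `parse` and `check`, so only `copy` is executed exclusively by the crashing run and receives the top score. The more code the crashing and passing runs have in common, the fewer elements compete for the top of the ranking.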

Innovation

Methods, ideas, or system contributions that make the work stand out.

Refines fuzzed crashing inputs to more closely resemble passing inputs

Iteratively applies edits while preserving crashes

Improves the accuracy of spectrum-based fault localization

🔎 Similar Papers

On the Challenges of Fuzzing Techniques via Large Language Models