FalseCrashReducer: Mitigating False Positive Crashes in OSS-Fuzz-Gen Using Agentic AI

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Automatically generated fuzz drivers often produce false-positive crashes, particularly for functions with highly structured inputs and strict state requirements. In industrial systems such as OSS-Fuzz-Gen, reporting these crashes erodes maintainer trust. Method: This paper proposes two AI-agent strategies: (1) constraint-based driver generation, which combines large language models (LLMs) with static program analysis to infer constraints on a function's inputs and state and uses them to guide driver creation; and (2) context-based crash validation, in which cooperating agents analyze a function's callers to decide whether a reported crash is feasible from program entry points. Results: On a benchmark of 1,500 real-world functions from OSS-Fuzz, the strategies reduce spurious crashes by up to 8% and cut total reported crashes by 52%. The results indicate that frontier LLMs can serve as reliable program analysis agents in production-grade fuzzing infrastructure.

📝 Abstract
Fuzz testing has become a cornerstone technique for identifying software bugs and security vulnerabilities, with broad adoption in both industry and open-source communities. Directly fuzzing a function requires fuzz drivers, which translate random fuzzer inputs into valid arguments for the target function. Given the cost and expertise required to manually develop fuzz drivers, methods exist that leverage program analysis and Large Language Models to automatically generate these drivers. However, the generated fuzz drivers frequently lead to false positive crashes, especially in functions with highly structured inputs and complex state requirements. This problem is especially critical in industry-scale fuzz driver generation efforts like OSS-Fuzz-Gen, as reporting false positive crashes to maintainers impedes trust in both the system and the team. This paper presents two AI-driven strategies to reduce false positives in OSS-Fuzz-Gen, a multi-agent system for automated fuzz driver generation. First, constraint-based fuzz driver generation proactively enforces constraints on a function's inputs and state to guide driver creation. Second, context-based crash validation reactively analyzes function callers to determine whether reported crashes are feasible from program entry points. Using 1,500 benchmark functions from OSS-Fuzz, we show that these strategies reduce spurious crashes by up to 8%, cut reported crashes by more than half, and demonstrate that frontier LLMs can serve as reliable program analysis agents. Our results highlight the promise and challenges of integrating AI into large-scale fuzzing pipelines.
Problem

Research questions and friction points this paper is trying to address.

Reducing false positive crashes in automated fuzz driver generation
Addressing structured input and complex state requirements in functions
Mitigating trust issues from reporting false crashes to maintainers
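
To make the problem concrete, here is a minimal sketch of how a driver that ignores a function's input constraints can report a crash that no real caller could trigger. Everything in it is hypothetical and chosen for illustration: `parse_record`, its assumed contract (`0 < length <= len(buf)`), and both drivers are not taken from OSS-Fuzz-Gen, and Python stands in for the C/C++ targets that fuzzing pipelines typically exercise.

```python
# Toy illustration only: parse_record, its contract, and both drivers are
# hypothetical and are not taken from OSS-Fuzz-Gen.

def parse_record(buf: bytes, length: int) -> int:
    """Target function. Assumed contract: 0 < length <= len(buf)."""
    # Callers that honor the contract never index out of range here.
    return buf[length - 1]


def naive_driver(data: bytes) -> None:
    """Passes fuzzer bytes straight through, ignoring the contract."""
    if not data:
        return
    claimed_length = data[0]  # arbitrary value in 0..255
    parse_record(data[1:], claimed_length)


def constrained_driver(data: bytes) -> None:
    """Builds arguments that satisfy the contract before calling the target."""
    if len(data) < 2:
        return  # not enough bytes to build a valid call
    payload = data[1:]
    length = 1 + data[0] % len(payload)  # always in 1..len(payload)
    parse_record(payload, length)


def crashes(driver, data: bytes) -> bool:
    """Report whether the driver raises IndexError on this input."""
    try:
        driver(data)
        return False
    except IndexError:
        return True
```

With the input `bytes([200, 1, 2, 3])`, the naive driver claims a length of 200 for a 3-byte payload and crashes, while the constrained driver maps the same bytes to a valid call. The crash in the first case says something about the driver, not about `parse_record`.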
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constraint-based generation enforces input and state constraints
Context-based validation analyzes function callers reactively
AI agents reduce false positives in fuzz driver generation
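
The idea behind context-based validation can be sketched as a reachability check over call sites. This is a deliberately coarse simplification and not the paper's implementation: the call graph, the function names, and the `guarded` flags are all hypothetical, and the flag that marks a call site as enforcing the violated precondition is hard-coded here, whereas the paper describes agents analyzing function callers to make that judgment.

```python
from collections import deque

# Hypothetical call graph: caller -> list of (callee, guarded).
# guarded=True means that call site is assumed to validate the argument
# whose violation triggered the crash, so the bad value cannot pass it.
CALL_SITES = {
    "main": [("load_file", False)],
    "load_file": [("parse_header", True), ("parse_body", False)],
    "parse_header": [("read_field", False)],
    "parse_body": [("read_chunk", False)],
}


def crash_is_feasible(target, entry_points, call_sites):
    """Return True if an entry point reaches target through unguarded call sites only."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        if fn == target:
            return True
        for callee, guarded in call_sites.get(fn, []):
            if guarded or callee in seen:
                continue  # guarded sites block the bad value; skip visited nodes
            seen.add(callee)
            queue.append(callee)
    return False
```

In this toy graph, a crash in `read_chunk` is treated as feasible because every call site on the path from `main` is unguarded, while a crash in `read_field` is treated as a likely false positive because its only path passes through a guarded call site.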
Authors

Paschal C. Amusuo, Purdue University
Dongge Liu, Google (Cybersecurity, Machine Learning)
Ricardo Andres Calvo Mendez, Purdue University
Jonathan Metzman, Google LLC
Oliver Chang, Google LLC
James C. Davis, Purdue University