PBFuzz: Agentic Directed Fuzzing for PoV Generation

📅 2025-12-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing approaches, such as directed greybox fuzzing and LLM-assisted fuzzing, solve constraints inefficiently and preserve input structure poorly when generating Proof-of-Vulnerability (PoV) inputs, which must satisfy reachability and triggering constraints at the same time. Method: This paper proposes PBFuzz, the first agent-oriented PoV generation framework. It emulates human expert analysis by integrating LLM-powered semantic understanding, customized program path analysis, debugger-in-the-loop feedback optimization, and cost-aware API scheduling. The framework supports autonomous reasoning, persistent memory, and property-based testing. Results: Evaluated on the Magma benchmark, PBFuzz triggers 57 vulnerabilities, including 17 that no existing fuzzer exposed. Its median time-to-exposure is 339 seconds, 25.6× faster than AFL++ with CmpLog, at an API cost of only $1.83 per vulnerability.

📝 Abstract

Proof-of-Vulnerability (PoV) input generation is a critical task in software security and supports downstream applications such as patch generation and validation. Generating a PoV input requires solving two sets of constraints: (1) reachability constraints for reaching vulnerable code locations, and (2) triggering constraints for activating the target vulnerability. Existing approaches, including directed greybox fuzzing and LLM-assisted fuzzing, struggle to efficiently satisfy these constraints. This work presents an agentic method that mimics human experts. Human analysts iteratively study code to extract semantic reachability and triggering constraints, form hypotheses about PoV triggering strategies, encode them as test inputs, and refine their understanding using debugging feedback. We automate this process with an agentic directed fuzzing framework called PBFuzz. PBFuzz tackles four challenges in agentic PoV generation: autonomous code reasoning for semantic constraint extraction, custom program-analysis tools for targeted inference, persistent memory to avoid hypothesis drift, and property-based testing for efficient constraint solving while preserving input structure. In experiments on the Magma benchmark, PBFuzz triggered 57 vulnerabilities, surpassing all baselines, and uniquely triggered 17 vulnerabilities not exposed by existing fuzzers. PBFuzz achieved this within a 30-minute budget per target, while conventional approaches use 24 hours. Median time-to-exposure was 339 seconds for PBFuzz versus 8680 seconds for AFL++ with CmpLog, giving a 25.6x efficiency improvement with an API cost of 1.83 USD per vulnerability.
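
The headline ratios in the abstract can be re-derived from the reported figures. The short sketch below only repeats that arithmetic; the variable names are illustrative and are not taken from the paper.

```python
# Figures as reported in the abstract.
median_pbfuzz_s = 339        # PBFuzz median time-to-exposure, seconds
median_aflpp_cmplog_s = 8680 # AFL++ with CmpLog median time-to-exposure, seconds

pbfuzz_budget_min = 30            # PBFuzz budget per target, minutes
conventional_budget_min = 24 * 60 # conventional 24-hour budget, minutes

# Ratio of the two medians, i.e. the reported efficiency improvement.
speedup = median_aflpp_cmplog_s / median_pbfuzz_s

# How many times smaller the PBFuzz time budget is.
budget_ratio = conventional_budget_min / pbfuzz_budget_min

print(f"median speedup: {speedup:.1f}x")
print(f"budget ratio:   {budget_ratio:.0f}x")
```

The speedup compares medians only, so it says nothing about the spread of exposure times across individual vulnerabilities.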

Problem

Research questions and friction points this paper is trying to address.

Automates human-like code analysis for vulnerability proof generation

Addresses inefficiency in satisfying reachability and triggering constraints

Improves speed and cost-effectiveness of vulnerability exposure in fuzzing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automates human expert process with agentic directed fuzzing framework

Integrates autonomous code reasoning and custom program-analysis tools

Uses property-based testing for efficient constraint solving
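
To make the property-based testing idea concrete, here is a minimal sketch of a structure-preserving input generator searching for a PoV against a toy target. It is an illustration under stated assumptions, not PBFuzz's implementation: the file format, `MAGIC`, `BUF_CAPACITY`, `toy_target`, `generate_input`, and `search_pov` are all hypothetical, and plain random sampling stands in for whatever solving strategy the framework actually uses.

```python
import random
import struct

MAGIC = b"IMG0"       # hypothetical file-format magic (assumption)
BUF_CAPACITY = 4096   # hypothetical fixed buffer size in the toy target


def toy_target(data: bytes) -> str:
    """Toy parser standing in for a real fuzz target."""
    if len(data) < 8 or data[:4] != MAGIC:
        return "rejected"       # reachability constraint not satisfied
    width, height = struct.unpack(">HH", data[4:8])
    if width * height > BUF_CAPACITY:
        return "triggered"      # triggering constraint satisfied
    return "reached"            # vulnerable code reached, bug not triggered


def generate_input(rng: random.Random) -> bytes:
    """Structure-preserving generator: the header is always well formed,
    so every sample satisfies the reachability constraint by construction."""
    width = rng.randint(1, 256)
    height = rng.randint(1, 256)
    return MAGIC + struct.pack(">HH", width, height)


def search_pov(seed: int = 0, max_trials: int = 1000):
    """Sample structured inputs until one violates the safety property.

    Returns (trial_number, input_bytes), or None if the budget runs out.
    """
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        candidate = generate_input(rng)
        if toy_target(candidate) == "triggered":
            return trial, candidate
    return None


result = search_pov()
if result is not None:
    trial, pov = result
    print(f"PoV found after {trial} trial(s): {pov.hex()}")
```

Because the generator encodes the format, no trial is wasted on inputs the parser rejects outright, which is the sense in which generating inputs this way preserves input structure; unstructured byte mutation would first have to rediscover the magic header.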

🔎 Similar Papers

On the Challenges of Fuzzing Techniques via Large Language Models