FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets four weaknesses of large language models (LLMs) in vulnerability detection: high false-positive rates, reports that cannot be reproducibly verified, coarse-grained localization, and difficulty handling cross-function dependencies. The authors propose an automated vulnerability discovery and reproduction system built on a multi-agent LLM framework that combines fuzzing with program analysis. Its main components are Suspicious Point, a control-flow-based abstraction for localizing candidate bugs; a two-tier fuzzing mechanism; logic-driven hierarchical function analysis; and reasoning supported by MCP-based static and dynamic analysis tools and context engineering. On the AIxCC 2025 Final Competition C/C++ dataset, the system detects 36 of 40 vulnerabilities (90%). In real-world deployment, it found 29 zero-day vulnerabilities across 12 open-source projects, all confirmed and fixed by maintainers, two of which have been assigned CVE IDs.

📝 Abstract

Software vulnerabilities pose critical security threats, with nearly 50,000 CVEs reported in 2025. While Large Language Models (LLMs) show promise for automated vulnerability detection, three key challenges remain. First, LLM-generated vulnerability reports suffer from high false-positive rates and lack reproducible verification. Second, existing LLM-based approaches use suboptimal granularities for vulnerability localization: function-level analysis overlooks bugs when the context becomes extensive, while line-level analysis lacks sufficient context. Third, existing approaches have difficulty reasoning about vulnerabilities with complex cross-function dependencies and triggering conditions. We present FuzzingBrain V2, a multi-agent system that addresses these gaps through four key contributions: (1) fully automated vulnerability analysis built on Google's OSS-Fuzz, ensuring that all reported vulnerabilities are fuzzer-reproducible; (2) Suspicious Point, a novel control-flow-based abstraction for precise vulnerability localization at the optimal granularity; (3) logic-driven hierarchical function analysis with dual-layer fuzzing that improves function coverage under resource constraints; (4) MCP-based static and dynamic analysis tools with context engineering that strengthen reasoning about complex vulnerabilities. On the AIxCC 2025 Final Competition C/C++ dataset, FuzzingBrain V2 achieved a 90% detection rate (36 of 40 vulnerabilities). In real-world deployment, FuzzingBrain V2 discovered 29 zero-day vulnerabilities across 12 open-source projects, all confirmed and fixed by maintainers, with 2 assigned CVE IDs.
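
The requirement that every reported vulnerability be fuzzer-reproducible can be illustrated with a minimal sketch. This is not the authors' implementation: the `SuspiciousPoint` fields, the `reproduces` and `confirmed` helpers, and the stand-in target are all hypothetical names introduced here for illustration. The only idea taken from the abstract is that a candidate finding is kept only when a concrete input makes the target fail.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class SuspiciousPoint:
    """Hypothetical record for a candidate bug; field names are illustrative."""
    function: str       # function containing the suspected flaw
    description: str    # condition that was flagged as suspicious
    reproducer: bytes   # candidate input meant to trigger the flaw


def reproduces(target_cmd: list[str], data: bytes, timeout: float = 10.0) -> bool:
    """Return True if the target exits abnormally when fed `data` on stdin."""
    try:
        result = subprocess.run(
            target_cmd, input=data, capture_output=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False  # hangs are ignored in this sketch
    return result.returncode != 0


def confirmed(points: list[SuspiciousPoint], target_cmd: list[str]) -> list[SuspiciousPoint]:
    """Keep only the candidates whose reproducer actually makes the target fail."""
    return [p for p in points if reproduces(target_cmd, p.reproducer)]


# Stand-in for a real fuzz target (an assumption for this sketch):
# it exits with an error whenever the input starts with b"BUG".
STAND_IN_TARGET = [
    sys.executable, "-c",
    "import sys; sys.exit(1 if sys.stdin.buffer.read().startswith(b'BUG') else 0)",
]
```

In a real setup the stand-in command would be replaced by a fuzz-target binary built with sanitizers, so that a memory error terminates the process with a non-zero exit status. Candidates whose reproducer does not fail the target are dropped, which is how a reproduction gate of this kind reduces false positives.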

Problem

Research questions and friction points this paper is trying to address.

vulnerability detection

false positives

reproducibility

localization granularity

cross-function dependencies

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent LLM

vulnerability localization

fuzzing