Mut4All: Fuzzing Compilers via LLM-Synthesized Mutators Learned from Bug Reports

📅 2025-07-25
🤖 AI Summary
Manually designing mutation operators for compilers of modern languages (e.g., Rust, C++) is costly and generalizes poorly, because language features such as templates and macros are complex. Method: This paper introduces Mut4All, a language-agnostic, LLM-driven framework for synthesizing fuzzing mutators. It learns mutation patterns from 1,000 real-world bug reports (500 Rust, 500 C++) and synthesizes mutation operators with three collaborating agents (*Invent*, *Implement*, and *Refine*), using unit-test feedback to verify and correct each operator. Contribution/Results: At an average cost of about $0.08 per operator via GPT-4o, the framework synthesizes 722 mutation operators (319 Rust, 403 C++). A fuzzer using them finds 62 bugs in Rust compilers and 34 in C++ compilers, of which 38 Rust bugs and 16 C++ bugs were previously unknown. In unique crash detection and coverage, Mut4All ranks first among the compared methods on Rust and second on C++.
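As a quick consistency check, the headline figures can be reproduced with simple arithmetic. The sketch below uses only numbers quoted on this page (the per-language mutator counts come from the abstract); the total cost is a rough estimate, since the source gives the per-mutator cost only approximately.

```python
# Figures quoted in the summary and abstract of this page.
rust_mutators = 319
cpp_mutators = 403
cost_per_mutator_usd = 0.08  # approximate ("~$0.08") in the source

# The per-language counts should add up to the 722 mutators in the summary.
total_mutators = rust_mutators + cpp_mutators

# Rough total synthesis cost implied by the approximate per-mutator cost.
approx_total_cost_usd = round(total_mutators * cost_per_mutator_usd, 2)

print(total_mutators)         # 722
print(approx_total_cost_usd)  # 57.76
```

Under these figures, synthesizing the whole mutator set would cost on the order of $58, assuming the quoted average holds across all 722 mutators.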

📝 Abstract
Mutation-based fuzzing is effective for uncovering compiler bugs, but designing high-quality mutators for modern languages with complex constructs (e.g., templates, macros) remains challenging. Existing methods rely heavily on manual design or human-in-the-loop correction, limiting scalability and cross-language generalizability. We present Mut4All, a fully automated, language-agnostic framework that synthesizes mutators using Large Language Models (LLMs) and compiler-specific knowledge from bug reports. It consists of three agents: (1) a mutator invention agent that identifies mutation targets and generates mutator metadata using compiler-related insights; (2) a mutator implementation synthesis agent, fine-tuned to produce initial implementations; and (3) a mutator refinement agent that verifies and corrects the mutators via unit-test feedback. Mut4All processes 1000 bug reports (500 Rust, 500 C++), yielding 319 Rust and 403 C++ mutators at ~$0.08 each via GPT-4o. Our customized fuzzer, using these mutators, finds 62 bugs in Rust compilers (38 new, 7 fixed) and 34 bugs in C++ compilers (16 new, 1 fixed). Mut4All outperforms existing methods in both unique crash detection and coverage, ranking first on Rust and second on C++.
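The three-agent flow described in the abstract can be pictured as an invent, implement, and refine loop driven by unit-test feedback. The following is a minimal sketch under stated assumptions, not the authors' implementation: every name (`MutatorSpec`, `invent`, `implement`, `refine`, `unit_test`, `synthesize`) is hypothetical, the LLM calls are replaced by hard-coded stubs, and the fine-tuning of the implementation agent is not modeled.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# A mutator maps source text to mutated source text (an assumption of this sketch).
Mutator = Callable[[str], str]


@dataclass
class MutatorSpec:
    """Hypothetical metadata produced by the invention step."""
    name: str
    target: str
    description: str


def invent(bug_report: str) -> MutatorSpec:
    # Stub for the invention agent; a real system would prompt an LLM with the report.
    return MutatorSpec("widen_int_literal", "integer literal",
                       f"Derived from report: {bug_report[:40]}")


def implement(spec: MutatorSpec) -> Mutator:
    # Stub for the implementation agent; deliberately wrong (it changes nothing),
    # so that the refinement step has something to fix.
    return lambda src: src


def refine(spec: MutatorSpec, mutator: Mutator, feedback: str) -> Mutator:
    # Stub for the refinement agent; a real system would send `feedback` to an LLM.
    # Here it returns a fixed version that rewrites the first integer literal.
    return lambda src: re.sub(r"\b\d+\b", "2147483647", src, count=1)


def unit_test(mutator: Mutator) -> Optional[str]:
    """Return None if the mutator passes, otherwise a failure message."""
    seed = "fn main() { let x = 1; }"
    try:
        out = mutator(seed)
    except Exception as exc:
        return f"mutator raised {exc!r}"
    if out == seed:
        return "mutator left the seed program unchanged"
    return None


def synthesize(bug_report: str, max_rounds: int = 3) -> Optional[Mutator]:
    """Invent a spec, implement it, then refine until the unit test passes."""
    spec = invent(bug_report)
    mutator = implement(spec)
    for _ in range(max_rounds):
        feedback = unit_test(mutator)
        if feedback is None:
            return mutator
        mutator = refine(spec, mutator, feedback)
    # Accept the last candidate only if it passes; otherwise give up.
    return mutator if unit_test(mutator) is None else None
```

Calling `synthesize("internal compiler error: ...")` returns a mutator after one refinement round; applied to `fn main() { let x = 1; }` it yields `fn main() { let x = 2147483647; }`. Bounding the loop with `max_rounds` is a design choice of this sketch that keeps the per-mutator cost predictable.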

Problem

Research questions and friction points this paper is trying to address.

Automating mutator design for compiler fuzzing with LLMs
Overcoming manual mutator limitations in complex languages
Enhancing bug detection in Rust and C++ compilers

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-synthesized mutators from bug reports
Automated mutator refinement via unit-test feedback
Language-agnostic framework for compiler fuzzing

Authors

Bo Wang
Beijing Jiaotong University
Pengyang Wang
Assistant Professor, University of Macau
data mining, representation learning, urban computing
Chong Chen
Beijing Jiaotong University
Qi Sun
Beijing Jiaotong University
Jieke Shi
PhD Candidate & Research Engineer, Singapore Management University
Software Engineering, AI Software Testing
Chengran Yang
Singapore Management University
Ming Deng
Shanghai University
Computer Science
Youfang Lin
Beijing Jiaotong University
Zhou Yang
University of Alberta
David Lo
Singapore Management University