ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of generating valid test programs that trigger bugs in the Rust compiler. The authors propose clozeMask, a bracket-based masking and filling strategy that extracts test code from historical issue reports, masks code snippets with specific structures, and uses a large language model (LLM) to fill in the masked portions, synthesizing new test programs that probe the compiler's edge-case behavior. Because each new program starts from code that once triggered a compiler bug, the approach draws on the generative capabilities of LLMs while retaining the ability to trigger bugs. The prototype, CLOZEMASTER, has identified 27 confirmed bugs in rustc and mrustc, 10 of which have been fixed by developers, and the reported experiments show it outperforming existing fuzzers in code coverage and effectiveness.

📝 Abstract

Ensuring the reliability of the Rust compiler is of paramount importance, given the increasing adoption of Rust for critical systems development, driven by its emphasis on memory and thread safety. However, generating valid test programs for the Rust compiler poses significant challenges, given Rust's complex syntax and strict requirements. With the growing popularity of large language models (LLMs), much research in software testing has explored using LLMs to generate test cases. Still, directly using LLMs to generate Rust programs often results in a large number of invalid test cases. Existing studies have indicated that test cases triggering historical compiler bugs can assist in software testing. Our investigation into Rust compiler bug issues supports this observation. Inspired by existing work and our empirical research, we introduce a bracket-based masking and filling strategy called clozeMask. The clozeMask strategy involves extracting test code from historical issue reports, identifying and masking code snippets with specific structures, and using an LLM to fill in the masked portions for synthesizing new test programs. This approach harnesses the generative capabilities of LLMs while retaining the ability to trigger Rust compiler bugs. It enables comprehensive testing of the compiler's behavior, particularly exploring edge cases. We implemented our approach as a prototype, CLOZEMASTER. CLOZEMASTER has identified 27 confirmed bugs for rustc and mrustc, of which 10 have been fixed by developers. Furthermore, our experimental results indicate that CLOZEMASTER outperforms existing fuzzers in terms of code coverage and effectiveness.
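
As an illustration of the mask-and-fill idea described in the abstract, the following Python sketch masks the contents of one matched bracket pair in a Rust snippet and hands the result to a fill function. It is a minimal sketch under stated assumptions, not the CLOZEMASTER implementation: the `<MASK>` token, the function names, and the choice of which bracket kinds to consider are all hypothetical, and the `fill` callback stands in for an LLM call whose prompt and model are not specified here.

```python
from typing import Callable, List, Tuple

MASK = "<MASK>"  # hypothetical placeholder token, not taken from the paper
PAIRS = {"(": ")", "[": "]", "{": "}"}  # assumed bracket kinds


def bracket_spans(src: str) -> List[Tuple[int, int]]:
    """Return (open_index, close_index) for every matched bracket pair.

    Simplification: brackets inside string literals or comments are not
    treated specially, and unmatched closing brackets are ignored.
    """
    stack: List[Tuple[str, int]] = []
    spans: List[Tuple[int, int]] = []
    for i, ch in enumerate(src):
        if ch in PAIRS:
            stack.append((ch, i))
        elif ch in PAIRS.values() and stack and PAIRS[stack[-1][0]] == ch:
            _, start = stack.pop()
            spans.append((start, i))
    return sorted(spans)


def mask_span(src: str, span: Tuple[int, int]) -> str:
    """Replace the text strictly between the two brackets with MASK."""
    start, end = span
    return src[: start + 1] + MASK + src[end:]


def infill(masked: str, fill: Callable[[str], str]) -> str:
    """Substitute the mask with whatever `fill` proposes for this context."""
    return masked.replace(MASK, fill(masked), 1)


# A toy seed; a real seed would come from a historical issue report.
seed = "fn f(x: u8) -> [u8; 2] { [x, x] }"
masked = mask_span(seed, bracket_spans(seed)[0])  # mask the parameter list
candidate = infill(masked, lambda context: "y: u16")  # stub instead of an LLM
```

Each candidate produced this way would then be compiled, with crashes or other unexpected compiler behavior flagged for inspection. Keeping the surrounding seed code intact is what lets the new program stay close to the structure that triggered the original bug.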

Problem

Research questions and friction points this paper is trying to address.

Rust compiler

test program generation

fuzzing

compiler bugs

validity

Innovation

Methods, ideas, or system contributions that make the work stand out.

clozeMask

LLM-based fuzzing

Rust compiler testing