🤖 AI Summary
Rust’s `unsafe` code can introduce memory-safety vulnerabilities, yet existing detection tools handle Rust-specific constructs such as generics, traits, and macros poorly and rely heavily on manual intervention. To address this, we propose an approach that combines static analysis with large language model (LLM)-guided generation of fuzzing harnesses. The method introduces a mechanism that replaces generic type parameters with custom types, and it uses CodeLlama to augment harnesses dynamically, which lets the fuzzer simulate realistic user behavior and explore complex API interactions. It integrates precise Rust type resolution and custom type injection, and it is compatible with AFL++ and LibFuzzer. Evaluated on 27 real-world crates, the technique reproduced 20 known vulnerabilities and discovered 6 previously unknown ones. It detects more vulnerabilities than state-of-the-art tools, demonstrating both scalability and precision in identifying memory-safety flaws in Rust’s `unsafe` code.
📝 Abstract
Although Rust ensures memory safety by default, it also permits the use of unsafe code, which can introduce memory safety vulnerabilities if misused. Unfortunately, existing tools for detecting memory bugs in Rust typically exhibit limited detection capabilities, inadequately handle Rust-specific types, or rely heavily on manual intervention. To address these limitations, we present deepSURF, a tool that integrates static analysis with Large Language Model (LLM)-guided fuzzing harness generation to effectively identify memory safety vulnerabilities in Rust libraries, specifically targeting unsafe code. deepSURF introduces a novel approach for handling generics by substituting them with custom types and generating tailored implementations for the required traits, enabling the fuzzer to simulate user-defined behaviors within the fuzzed library. Additionally, deepSURF employs LLMs to augment fuzzing harnesses dynamically, facilitating exploration of complex API interactions and significantly increasing the likelihood of exposing memory safety vulnerabilities. We evaluated deepSURF on 27 real-world Rust crates, successfully rediscovering 20 known memory safety bugs and uncovering 6 previously unknown vulnerabilities, demonstrating clear improvements over state-of-the-art tools.
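To make the generic-substitution idea concrete, the following is a minimal sketch, not deepSURF's actual output. It assumes a hypothetical generic library function `insert_sorted<T: Ord>` and a hypothetical custom type `FuzzKey` whose `Ord` implementation is driven by the fuzzer's input bytes, so the "user-defined" comparison can behave arbitrarily, even inconsistently. The names `ByteSource`, `FuzzKey`, `insert_sorted`, and `harness` are all illustrative assumptions.

```rust
use std::cell::RefCell;
use std::cmp::Ordering;
use std::rc::Rc;

/// Hypothetical stand-in for a generic library API under test.
pub fn insert_sorted<T: Ord>(v: &mut Vec<T>, x: T) {
    let mut i = 0;
    // Every comparison calls back into the caller-supplied `Ord` impl.
    while i < v.len() && v[i].cmp(&x) == Ordering::Less {
        i += 1;
    }
    v.insert(i, x);
}

/// Cursor over the fuzzer input; yields 0 once the input is exhausted.
struct ByteSource {
    data: Vec<u8>,
    pos: usize,
}

impl ByteSource {
    fn next_byte(&mut self) -> u8 {
        let b = self.data.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        b
    }
}

/// Custom type substituted for the generic parameter `T` (an assumption
/// made for illustration). Its trait methods consume fuzzer bytes.
#[derive(Clone)]
pub struct FuzzKey {
    id: u8,
    src: Rc<RefCell<ByteSource>>,
}

impl Ord for FuzzKey {
    // The result is chosen by the next input byte, not by the operands.
    fn cmp(&self, _other: &Self) -> Ordering {
        match self.src.borrow_mut().next_byte() % 3 {
            0 => Ordering::Less,
            1 => Ordering::Equal,
            _ => Ordering::Greater,
        }
    }
}

impl PartialOrd for FuzzKey {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for FuzzKey {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for FuzzKey {}

/// Harness body: the first byte picks how many keys to insert (0..=7);
/// later bytes supply key ids and comparison outcomes.
pub fn harness(data: &[u8]) -> Vec<u8> {
    let src = Rc::new(RefCell::new(ByteSource { data: data.to_vec(), pos: 0 }));
    let count = (src.borrow_mut().next_byte() % 8) as usize;
    let mut keys: Vec<FuzzKey> = Vec::new();
    for _ in 0..count {
        let id = src.borrow_mut().next_byte();
        insert_sorted(&mut keys, FuzzKey { id, src: Rc::clone(&src) });
    }
    // Return the final order of ids so the effect of the input is visible.
    keys.iter().map(|k| k.id).collect()
}
```

In a real fuzzing setup, a function like `harness` would be invoked from the entry point of the chosen fuzzer with each generated input. The point of the sketch is that the library's `unsafe` internals, if any, are exercised under trait behavior the library author may not have anticipated.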