AI Summary
This study systematically evaluates the capabilities and complementarity of fuzzing versus static analysis in detecting memory-unsafe vulnerabilities in C/C++ software. We construct a standardized benchmark comprising over 100 real-world vulnerabilities and empirically compare 13 fuzzers and 5 static analyzers across four dimensions: detection rate, false positive rate (differing by 2–8×), resource overhead, and engineering practicality. Our analysis reveals, for the first time across technology stacks, that their vulnerability coverage is highly orthogonal: fuzzing excels at identifying runtime memory corruption (e.g., use-after-free, buffer overflow), whereas static analysis is more effective at detecting logic-driven memory misuse (e.g., improper initialization, incorrect pointer arithmetic). Based on these findings, we propose a "co-evolution" paradigm that explicitly characterizes the trade-off boundaries between the two approaches. This work provides empirical evidence and a methodological foundation for tool selection and integration in industrial practice.
Abstract
Even today, over 70% of security vulnerabilities in critical software systems result from memory safety violations. To address this challenge, fuzzing and static analysis are widely used automated methods to discover such vulnerabilities. Fuzzing generates random program inputs to identify faults, while static analysis examines source code to detect potential vulnerabilities. Although these techniques share a common goal, they take fundamentally different approaches and have evolved largely independently. In this paper, we present an empirical analysis of five static analyzers and 13 fuzzers, applied to over 100 known security vulnerabilities in C/C++ programs. We measure the number of bug reports generated for each vulnerability to evaluate how the approaches differ and complement each other. Moreover, we randomly sample eight bug-containing functions, manually analyze all bug reports therein, and quantify false-positive rates. We also assess limits to bug discovery, ease of use, resource requirements, and integration into the development process. We find that both techniques discover different types of bugs, but there are clear winners for each. Developers should consider these tools depending on their specific workflow and usability requirements. Based on our findings, we propose future directions to foster collaboration between these research domains.