🤖 AI Summary
Existing static binary data-flow analyses lack systematic evaluation, suffer from low precision (0.13), and exhibit poorly understood bottlenecks. Method: We introduce the first large-scale, manually annotated benchmark—comprising over 215K micro-benchmark test cases—and conduct the first quantitative evaluation of three prominent frameworks (angr, Ghidra, and Miasm), revealing pervasive precision limitations. To address these, we propose three model extensions: (i) dynamic data-flow-guided constraint strengthening, (ii) cross-function contextual modeling, and (iii) joint precision–recall optimization. Contribution/Results: Our approach achieves a recall of 0.99 and a precision of 0.32, a 146% precision improvement over the baselines. Further validation on real-world CVE samples demonstrates its effectiveness in identifying vulnerable instructions in practical vulnerability-detection scenarios.
📝 Abstract
Data-flow analysis is a critical component of security research. Theoretically, accurate data-flow analysis of binary executables is undecidable due to the complexities of binary code. Practically, many binary analysis engines offer some data-flow analysis capability, but we lack an understanding of these analyses' accuracy and limitations. We address this problem by introducing a labeled benchmark data set of 215,072 micro-benchmark test cases, mapping to 277,072 binary executables, created specifically to evaluate data-flow analysis implementations. Additionally, we augment our benchmark set with dynamically discovered data flows from 6 real-world executables. Using our benchmark data set, we evaluate three state-of-the-art data-flow analysis implementations, in angr, Ghidra, and Miasm, and discuss their very low accuracy and the reasons behind it. We further propose three model extensions to static data-flow analysis that significantly improve accuracy, achieving almost perfect recall (0.99) and increasing precision from 0.13 to 0.32. Finally, we show that leveraging these model extensions in a vulnerability-discovery context leads to a tangible improvement in vulnerable-instruction identification.
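As a sanity check on the headline numbers, the relative precision improvement and the implied F1 score follow directly from the figures reported above (0.13 → 0.32 precision, 0.99 recall); the F1 value itself is derived here and is not stated in the paper:

```python
# Figures reported in the abstract.
precision_baseline = 0.13
precision_extended = 0.32
recall_extended = 0.99

# Relative precision improvement: (0.32 - 0.13) / 0.13 ≈ 146%.
improvement_pct = (precision_extended - precision_baseline) / precision_baseline * 100
print(f"precision improvement: {improvement_pct:.0f}%")  # → 146%

# F1 implied by the reported precision and recall (derived, not from the paper).
f1 = 2 * precision_extended * recall_extended / (precision_extended + recall_extended)
print(f"implied F1: {f1:.2f}")  # → 0.48
```

This confirms that the "146%" claim is the relative gain over the 0.13 baseline, not an absolute precision value.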