🤖 AI Summary
Existing directed grey-box fuzzing (DGF) relies on physical path distance between seeds and targets, ignoring semantic logical relationships among code segments—leading to inaccurate targeting and redundant guidance in complex binaries.
Method: We propose "attention distance", the first integration of large language model (LLM) attention mechanisms into fuzzing, to model semantic logical associations among code elements in place of traditional path-distance metrics. The approach combines LLM-based attention analysis with static and dynamic program feature extraction, and it is plug-and-play compatible with frameworks such as AFLGo.
Contribution/Results: Across 38 real-world vulnerability reproductions, replacing AFLGo's physical distance with attention distance yields a 3.43× average speedup over the unmodified AFLGo baseline, and the method outperforms DAFL and WindRanger by 2.89× and 7.13×, respectively. It also generalizes: integrating attention distance into DAFL and WindRanger consistently improves their original performance.
📝 Abstract
In the domain of software security testing, Directed Grey-Box Fuzzing (DGF) has garnered widespread attention for its efficient target localization and excellent detection performance. However, existing approaches measure only the physical distance between seed execution paths and target locations, overlooking logical relationships among code segments. This omission can yield redundant or misleading guidance in complex binaries, weakening DGF's real-world effectiveness. To address this, we introduce attention distance, a novel metric that leverages a large language model's contextual analysis to compute attention scores between code elements and reveal their intrinsic connections. Under the same AFLGo configuration, without altering any fuzzing components other than the distance metric, replacing physical distances with attention distances across 38 real vulnerability reproduction experiments delivers a 3.43× average increase in testing efficiency over the traditional method. Compared to state-of-the-art directed fuzzers DAFL and WindRanger, our approach achieves 2.89× and 7.13× improvements, respectively. To further validate the generalizability of attention distance, we integrate it into DAFL and WindRanger, where it also consistently enhances their original performance. All related code and datasets are publicly available at https://github.com/TheBinKing/Attention_Distance.git.
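The idea of swapping only the distance metric can be illustrated with a small sketch. The abstract does not give the formula for attention distance, so everything below is an assumption made for illustration: the per-block attention scores are taken as already computed and normalized to [0, 1], the mapping `1 - score` is a placeholder, and the names `attention_to_distance` and `seed_distance` are hypothetical. The seed-level value is a plain average over the distinct blocks a seed executes, loosely in the spirit of distance-based directed fuzzers, not the authors' implementation.

```python
from statistics import mean
from typing import Dict, Iterable, Optional


def attention_to_distance(score: float) -> float:
    """Map an attention score in [0, 1] to a distance.

    Assumed mapping (not from the paper): higher attention toward the
    target means a smaller distance, so distance = 1 - score.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("attention score must lie in [0, 1]")
    return 1.0 - score


def seed_distance(
    trace: Iterable[str], block_scores: Dict[str, float]
) -> Optional[float]:
    """Average the distances of the distinct scored blocks a seed executed.

    Blocks without a score are skipped. Returns None when the trace
    touches no scored block, so the caller can treat the seed as unguided.
    """
    distances = [
        attention_to_distance(block_scores[block])
        for block in set(trace)
        if block in block_scores
    ]
    return mean(distances) if distances else None


# Hypothetical attention scores of basic blocks with respect to the target.
scores = {"parse_header": 0.75, "read_chunk": 0.25}

close_seed = seed_distance(["parse_header", "parse_header"], scores)
mixed_seed = seed_distance(["parse_header", "read_chunk", "log_msg"], scores)
unguided_seed = seed_distance(["log_msg"], scores)
```

Under these assumptions a scheduler would prefer `close_seed` over `mixed_seed` because its distance is smaller, while the rest of the fuzzing loop stays unchanged. That separation is what makes such a metric a drop-in replacement for a physical distance.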