🤖 AI Summary
To address the high performance overhead of software-based dynamic program analysis and its vulnerability to anti-analysis techniques, this paper proposes LibIHT, a hardware-assisted dynamic binary analysis framework that pairs a kernel module with a user-space library. Using Intel’s Last Branch Record (LBR) and Branch Trace Store (BTS) mechanisms, the framework collects branch execution traces in kernel space at low overhead and reconstructs control-flow graphs (CFGs) from them. Compared with Intel Pin, the average slowdown falls from 1053× to 7× (an overhead reduction of over 150×), while the reconstructed CFGs capture over 99% of executed basic blocks and edges. The framework is evaluated on both benign benchmark programs and adversarial anti-instrumentation samples. Its core contribution is a hardware tracing architecture co-designed across the OS kernel and user space that balances efficiency, precision, and evasion resistance, at the cost of recording only branch addresses rather than the richer semantic detail of full software instrumentation.
📝 Abstract
Dynamic program analysis is invaluable for malware detection, debugging, and performance profiling. However, software-based instrumentation incurs high overhead and can be evaded by anti-analysis techniques. In this paper, we propose LibIHT, a hardware-assisted tracing framework that leverages on-CPU branch tracing features (Intel Last Branch Record and Branch Trace Store) to efficiently capture program control flow with minimal performance impact. Our approach reconstructs control-flow graphs (CFGs) from hardware-generated branch execution data collected in the kernel, without altering the behavior of the traced program, even for evasive malware. We implement LibIHT as an OS kernel module and user-space library, and evaluate it on both benign benchmark programs and adversarial anti-instrumentation samples. Our results indicate that LibIHT reduces runtime overhead by over 150x compared to Intel Pin (a 7x versus a 1,053x slowdown), while achieving high fidelity in CFG reconstruction (capturing over 99% of executed basic blocks and edges). Although this hardware-assisted approach captures only branch addresses and therefore sacrifices the richer semantic detail available from full software instrumentation, this trade-off is acceptable for many applications where performance and low detectability are paramount. Our findings show that hardware-based tracing captures control-flow information significantly faster, reduces detection risk, and performs dynamic analysis with minimal interference.
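To make the CFG reconstruction step concrete, the following is a minimal sketch of how branch records could be folded into a graph and scored against a reference. It is not LibIHT's API: the `(from_addr, to_addr)` record format, the function names `build_cfg` and `coverage`, and the sample addresses are all assumptions chosen for illustration.

```python
from collections import Counter


def build_cfg(branch_records):
    """Aggregate (from_addr, to_addr) branch records into CFG edge counts.

    The record format is an assumption for illustration; it mirrors the
    source/destination address pairs that branch-tracing hardware reports.
    """
    edges = Counter()
    for src, dst in branch_records:
        edges[(src, dst)] += 1
    # Every observed branch target begins a basic block.
    block_starts = sorted({dst for _, dst in edges})
    return edges, block_starts


def coverage(observed, reference):
    """Fraction of reference items (blocks or edges) that were observed."""
    reference = set(reference)
    if not reference:
        return 1.0
    return len(reference & set(observed)) / len(reference)


# Hypothetical trace: the same branch is taken twice, plus one back edge.
trace = [(0x401010, 0x401050), (0x401060, 0x401020), (0x401010, 0x401050)]
edges, block_starts = build_cfg(trace)
```

A fidelity figure such as the "over 99% of executed basic blocks and edges" reported above would correspond to calling `coverage` with the hardware-derived edges as `observed` and a ground-truth edge set (for example, one obtained from full instrumentation) as `reference`.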