🤖 AI Summary
This study addresses the practical challenges that large language model (LLM)-driven agents face in complex binary reverse engineering, including obfuscation, timing, and unusual architectures. Existing systems commonly suffer from token limitations, difficulty with obfuscated code, and a lack of program guardrails. The work examines static, dynamic, and hybrid agent paradigms in reverse engineering tasks and, through an analysis of existing agentic tool usage, identifies their characteristic limitations in realistic settings. From a security perspective, it outlines current challenges and future directions for designers of agentic reverse engineering systems.
📝 Abstract
Agentic systems built on large language models (LLMs) are increasingly used for complex security tasks, including binary reverse engineering (RE). Despite recent growth in popularity and capability, these systems continue to face limitations in realistic settings. Cutting-edge systems still fail in complex RE scenarios that involve obfuscation, timing, and unusual architectures. In this work, we examine how agentic systems perform reverse engineering tasks with static, dynamic, and hybrid agents. Through an analysis of existing agentic tool usage, we identify several limitations, including token constraints, struggles with obfuscation, and a lack of program guardrails. From these findings, we outline the current challenges and, from a security perspective, propose future directions for system designers.