🤖 AI Summary
Reverse engineering layered and custom-obfuscated code remains difficult: it relies heavily on manual heuristics, offers little automation, and lacks generalizable deobfuscation rules. Method: This paper introduces ReSMT, a mostly automated, SMT-driven reverse-engineering tool. It models obfuscated assembly code as a system of logical assertions by combining assembly semantics modeling, constraint generation, symbolic execution, and Z3-based SMT solving, which lets it recover functional semantics without predefined unpacking or deobfuscation rules. Contribution/Results: In an elaborate case study, ReSMT handled complex, chained obfuscation and answered reverse-engineering queries about the code's functionality. The authors present these results as evidence of the approach's potential for automating binary reverse engineering.
📝 Abstract
Software obfuscation techniques make code more difficult
to understand, without changing its functionality. Such techniques
are often used by authors of malicious software to avoid
detection. Reverse engineering
of obfuscated code, i.e., the process of overcoming obfuscation and
answering questions about the functionality of the code, is
notoriously difficult. While various tools and methods exist for
this purpose, the process remains complex and slow, especially when
dealing with layered or customized obfuscation techniques.
Here, we present a novel, automated tool for addressing some of the
challenges in reverse engineering of obfuscated code. Our tool,
called ReSMT, converts the obfuscated assembly code into a complex
system of logical assertions that represent the code functionality,
and then applies SMT solving and simulation tools to inspect the
obfuscated code's execution. The approach is mostly automatic,
alleviating the need for highly specialized deobfuscation skills.
In an elaborate case study, ReSMT successfully tackled complex
obfuscated code and answered reverse-engineering queries about it.
We believe that these results showcase the potential
and usefulness of our proposed approach.
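
To make the idea of "answering queries about obfuscated code" concrete, here is a minimal sketch that is not taken from ReSMT. It assumes a hypothetical toy instruction set with 8-bit registers, and it swaps the SMT solver for plain exhaustive search over all inputs, which is only feasible because the toy domain is tiny. The snippet, the register names, and the helper functions `run` and `find_input` are all illustrative assumptions.

```python
from itertools import product

MASK = 0xFF  # toy 8-bit registers keep exhaustive search cheap

# Hypothetical obfuscated snippet: computes r2 = r0 + r1 through the
# mixed boolean-arithmetic identity (x ^ y) + 2 * (x & y).
PROGRAM = [
    ("mov", "r2", "r0"),
    ("xor", "r2", "r1"),  # r2 = x ^ y
    ("mov", "r3", "r0"),
    ("and", "r3", "r1"),  # r3 = x & y
    ("add", "r3", "r3"),  # r3 = 2 * (x & y)
    ("add", "r2", "r3"),  # r2 = (x ^ y) + 2 * (x & y)
]

# Semantics of each toy instruction: new value of dst given (dst, src).
OPS = {
    "mov": lambda dst, src: src,
    "xor": lambda dst, src: dst ^ src,
    "and": lambda dst, src: dst & src,
    "add": lambda dst, src: (dst + src) & MASK,
}


def run(program, x, y):
    """Execute the toy program on inputs r0 = x, r1 = y and return r2."""
    regs = {"r0": x, "r1": y, "r2": 0, "r3": 0}
    for op, dst, src in program:
        regs[dst] = OPS[op](regs[dst], regs[src])
    return regs["r2"]


def find_input(program, predicate):
    """Return the first (x, y) whose output satisfies predicate, else None.

    Stand-in for a solver's satisfiability check: a result plays the role
    of a model, and None plays the role of "unsatisfiable".
    """
    for x, y in product(range(MASK + 1), repeat=2):
        if predicate(x, y, run(program, x, y)):
            return (x, y)
    return None


# Query 1: is there an input on which the snippet differs from x + y?
counterexample = find_input(PROGRAM, lambda x, y, out: out != (x + y) & MASK)

# Query 2: which input makes the snippet output 0x2A?
witness = find_input(PROGRAM, lambda x, y, out: out == 0x2A)
```

Here `counterexample` comes back as `None`, meaning the obfuscated sequence behaves like plain addition on every 8-bit input, while `witness` is a concrete input that produces the requested output. An SMT solver answers the same kind of question symbolically, which is what makes the approach viable for realistic register widths where enumeration is not.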