🤖 AI Summary
Neural decompilers often generate code that is syntactically invalid or semantically incorrect, and so fail to faithfully reconstruct the original high-level source program. Rather than relying on more data or more training, this work integrates compiler feedback into the neural decompilation pipeline and uses an automated search procedure to iteratively refine model outputs. The approach substantially improves semantic correctness while preserving similarity to the original source code. On the ExeBench Real -O2 split, it raises the decompilation success rate from 26.0% to 83.9%. It is also highly effective when applied to weaker neural decompilation models.
📝 Abstract
Decompilers are widely used in reverse engineering to understand compiled code. Reconstructing source code from compiled binaries is challenging because high-level syntax, identifiers, and custom data types are generally lost as the compiler translates human-readable code to low-level machine code. Deterministic decompilers are useful for binary analysis but can struggle to infer idiomatic syntax and identifier names. Generative AI models are a natural fit for reconstructing high-level syntax, identifiers, and types, but they can still hallucinate improper programming constructs and semantics. Instead of attempting to improve neural decompilers with more data and more training, we argue that compiler feedback can be used to dramatically improve the semantic correctness of neural decompiler outputs via search. Our system, Decaf (DECompilation with Automated Feedback), raises the neural decompilation rate from 26.0% to 83.9% on the ExeBench Real -O2 split without sacrificing similarity to the original source code. We also find that our automatic feedback methodology is highly effective for improving weaker neural decompilation models.
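The idea of improving decompiler outputs through compiler feedback and search can be illustrated with a minimal sketch. This is not Decaf's actual algorithm or API, which the abstract does not specify; it only shows the general shape of a propose, check, and retry loop. The names `search_with_feedback`, `toy_propose`, and `toy_check` are hypothetical, and the toy functions are stand-ins for a neural decompiler and for a real compile-and-test step.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions for illustration, not Decaf's API).
Propose = Callable[[str, str], str]        # (assembly, feedback) -> candidate source
Check = Callable[[str], Tuple[bool, str]]  # candidate -> (accepted, diagnostics)


def search_with_feedback(
    assembly: str, propose: Propose, check: Check, budget: int = 8
) -> Optional[str]:
    """Ask for candidates until one passes the check or the budget runs out."""
    feedback = ""
    for _ in range(budget):
        candidate = propose(assembly, feedback)
        accepted, diagnostics = check(candidate)
        if accepted:
            return candidate
        # Feed the diagnostics back into the next proposal.
        feedback = diagnostics
    return None


def toy_propose(assembly: str, feedback: str) -> str:
    # Stand-in for a neural decompiler: the first guess drops a semicolon,
    # and the retry repairs it once the diagnostic mentions it.
    if "expected ';'" in feedback:
        return "int add(int a, int b) { return a + b; }"
    return "int add(int a, int b) { return a + b }"


def toy_check(candidate: str) -> Tuple[bool, str]:
    # Stand-in for compiling the candidate and testing its behavior.
    if "a + b;" in candidate:
        return True, ""
    return False, "error: expected ';' before '}'"


result = search_with_feedback("add: lea eax, [rdi+rsi]; ret", toy_propose, toy_check)
print(result)
```

In a real setting, the check would presumably invoke a compiler and compare the candidate's behavior against the original binary, and the budget would bound how many candidates are explored per function.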