Context-Guided Decompilation: A Step Towards Re-executability

📅 2025-11-03

🤖 AI Summary

Existing decompilation methods, including LLM-based approaches, struggle to generate source code that can be both recompiled and executed, largely because compiler optimizations discard semantic information. To address this, we propose the first hybrid decompilation framework to incorporate in-context learning, combining static analysis with large language models. Our approach uses semantic guidance to recover cues erased by optimization, so that the high-level structure and behavior of the program can be reconstructed faithfully. The framework supports diverse compilers and optimization levels without fine-tuning. Evaluated on multiple benchmark datasets, our method improves the executable rate by around 40% over prior state-of-the-art techniques, a step toward decompiled code that actually runs in practice.

📝 Abstract

Binary decompilation plays an important role in software security analysis, reverse engineering, and malware understanding when source code is unavailable. However, existing decompilation techniques often fail to produce source code that can be successfully recompiled and re-executed, particularly for optimized binaries. Recent advances in large language models (LLMs) have enabled neural approaches to decompilation, but the generated code is typically only semantically plausible rather than truly executable, which limits the practical reliability of these approaches. These shortcomings arise from compiler optimizations and the loss of semantic cues in compiled code, which LLMs struggle to recover without contextual guidance. To address this challenge, we propose ICL4Decomp, a hybrid decompilation framework that leverages in-context learning (ICL) to guide LLMs toward generating re-executable source code. We evaluate our method across multiple datasets, optimization levels, and compilers, demonstrating an improvement of around 40% in re-executability over state-of-the-art decompilation methods while maintaining robustness.
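To make the idea of in-context guidance concrete, the following is a minimal sketch of how a few-shot decompilation prompt might be assembled. It is not the ICL4Decomp implementation: the `Example` type, the `jaccard` retriever, the `build_prompt` function, and the prompt layout are all hypothetical, and the abstract does not say how demonstrations are selected or formatted. Token-level Jaccard similarity is used here only as a simple stand-in for whatever retrieval the framework actually performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """A hypothetical (assembly, source) demonstration pair."""
    assembly: str
    source: str


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (assumed stand-in for a real retriever)."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def build_prompt(target_asm: str, pool: list[Example], k: int = 2) -> str:
    """Prepend the k demonstrations most similar to the target assembly."""
    ranked = sorted(
        pool,
        key=lambda ex: jaccard(target_asm, ex.assembly),
        reverse=True,
    )
    parts = [
        f"### Assembly\n{ex.assembly}\n### C source\n{ex.source}\n"
        for ex in ranked[:k]
    ]
    # The query repeats the demonstration layout but leaves the source empty,
    # so the model is asked to complete it.
    parts.append(f"### Assembly\n{target_asm}\n### C source\n")
    return "\n".join(parts)
```

The returned string would then be sent to an LLM of choice. Keeping retrieval and prompt assembly separate from the model call means the demonstration pool can be swapped per compiler or optimization level without retraining anything.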

Problem

Research questions and friction points this paper is trying to address.

Generating re-executable source code from optimized binaries

Overcoming semantic loss during compilation with contextual guidance

Improving decompilation reliability through hybrid neural approaches

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid framework using in-context learning

Guides LLMs to generate re-executable code

Improves re-executability by around 40% over state-of-the-art baselines
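The re-executability figure can be read as the share of decompiled functions that both recompile and behave correctly when run. The sketch below shows one way such a rate might be computed. The `Outcome` fields and both function names are assumptions for illustration, not the paper's evaluation harness, and the summary does not state whether the reported gain is relative or absolute, so the sketch computes a relative gain only as an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical result of one decompilation attempt."""
    recompiled: bool    # the decompiled source compiled without errors
    tests_passed: bool  # the recompiled binary passed its test cases


def re_executability_rate(outcomes: list[Outcome]) -> float:
    """Fraction of attempts that both recompiled and passed their tests."""
    if not outcomes:
        raise ValueError("no outcomes to score")
    ok = sum(1 for o in outcomes if o.recompiled and o.tests_passed)
    return ok / len(outcomes)


def relative_improvement(new_rate: float, baseline_rate: float) -> float:
    """Relative gain of new_rate over baseline_rate (assumed interpretation)."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return (new_rate - baseline_rate) / baseline_rate
```

Requiring both conditions is the point of the metric: code that compiles but fails its tests counts as a miss, which is what separates re-executable output from output that is only plausible.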
