Constraint-Guided Multi-Agent Decompilation for Executable Binary Recovery

📅 2026-04-26

🤖 AI Summary
Existing decompilers often produce code that is syntactically incorrect, uncompilable, or behaviorally inaccurate, limiting practical utility. This work proposes MCGD (Multi-level Constraint-Guided Decompilation), a multi-agent framework built on a three-tiered constraint system: syntactic correctness, compilability, and behavioral equivalence. When a check fails, specialized GPT-4o-based repair agents iteratively refine the code using structured error feedback, combining execution-based validation with LLM-driven repair to recover re-executable source code. Evaluated on 1,641 real-world binaries from ExeBench across three decompilers (RetDec, Ghidra, and Angr), MCGD attains a re-executability rate of 84–97%, improving on raw decompiler output by 28–89 percentage points. With the same GPT-4o backbone, it reaches 84.1%, ahead of LLM4Decompile (80.3%), SK2Decompile (73.9%), and SALT4Decompile (61.8%). Over 90% of samples converge within two iterations, at an average cost of $0.03–0.05 per sample.

📝 Abstract
Decompilation, the recovery of source code from compiled binaries, is essential for security analysis, malware reverse engineering, and legacy software maintenance. However, existing decompilers produce code that often fails to compile or execute correctly, limiting their practical utility. We present a multi-agent framework that transforms decompiled code into re-executable source through Multi-level Constraint-Guided Decompilation (MCGD). Our approach employs a hierarchical validation pipeline with three constraint levels: (1) syntactic correctness via parsing, (2) compilability via GCC, and (3) behavioral equivalence via LLM-generated test cases. When validation fails, specialized LLM agents iteratively refine the code using structured error feedback. We evaluate our framework on 1,641 real-world binaries from ExeBench across three decompilers (RetDec, Ghidra, and Angr). Our framework achieves 84-97% re-executability, improving baseline decompiler output by 28-89 percentage points. In comparison with state-of-the-art LLM-based decompilation methods using the same GPT-4o backbone, our approach (84.1%) outperforms LLM4Decompile (80.3%), SK2Decompile (73.9%), and SALT4Decompile (61.8%). Our ablation study reveals that execution-based validation is critical: compile-only approaches achieve 0% behavioral correctness despite 91-99% compilation rates. The system converges efficiently, with over 90% of binaries reaching correctness within 2 iterations at an average cost of $0.03-0.05 per binary. Our results demonstrate that constraint-guided agentic refinement can bridge the gap between raw decompiler output and practically useful source code.
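The validate-then-repair loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `refine`, `first_failure`, `Result`, and `toy_repair` are hypothetical, and the three checks are toy string tests standing in for a real parser, a GCC invocation, and an LLM-generated test harness.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A check returns None on success or an error message on failure.
Check = Callable[[str], Optional[str]]
# A repair agent maps (code, failed level, error message) to revised code.
Repair = Callable[[str, str, str], str]


@dataclass
class Result:
    code: str
    passed: bool
    repairs: int                 # number of repair calls made
    failed_level: Optional[str]  # None when every level passed


def first_failure(code: str, checks: List[Tuple[str, Check]]) -> Optional[Tuple[str, str]]:
    """Run the constraint levels in order and stop at the first failure."""
    for level, check in checks:
        error = check(code)
        if error is not None:
            return level, error
    return None


def refine(code: str, checks: List[Tuple[str, Check]], repair: Repair,
           max_repairs: int = 5) -> Result:
    """Validate; on failure, pass the level and error message to the repair agent."""
    repairs = 0
    while True:
        failure = first_failure(code, checks)
        if failure is None:
            return Result(code, True, repairs, None)
        if repairs >= max_repairs:
            return Result(code, False, repairs, failure[0])
        code = repair(code, failure[0], failure[1])
        repairs += 1


# Toy stand-ins (assumptions for illustration only, not real validators).
checks = [
    ("syntax", lambda c: None if c.count("{") == c.count("}") else "unbalanced braces"),
    ("compile", lambda c: "unknown type 'undefined8'" if "undefined8" in c else None),
    ("behavior", lambda c: None if "a + b" in c else "add(2, 3) did not return 5"),
]


def toy_repair(code: str, level: str, error: str) -> str:
    """Hypothetical repair agent; a real one would prompt an LLM with `error`."""
    if level == "syntax":
        return code + " }"
    if level == "compile":
        return code.replace("undefined8", "long")
    return code.replace("a - b", "a + b")


raw = "undefined8 add(undefined8 a, undefined8 b) { return a - b;"
result = refine(raw, checks, toy_repair)
print(result.passed, result.repairs, result.code)
```

In this toy run, dropping the `"behavior"` entry from `checks` lets the loop stop at code that passes the first two levels but still computes `a - b`. That is the kind of failure the compile-only ablation points to.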

Problem

Research questions and friction points this paper is trying to address.

decompilation
executable binary recovery
re-executability
behavioral equivalence
compilability

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent decompilation
constraint-guided refinement
behavioral equivalence
LLM-based code repair
re-executable decompilation