PCodeTrans: Translate Decompiled Pseudocode to Compilable and Executable Equivalent

📅 2026-03-16

🤖 AI Summary

This work proposes a feedback-driven decompilation framework that couples dynamic function-level validation with large language models (LLMs). It targets a longstanding limitation of traditional decompilers: their output is readable but often fails to recompile or to reproduce the original binary's behavior, which undermines its reliability for software modernization and vulnerability repair. The approach extracts a minimal compilable context for each function, hot-swaps the recompiled function in place into the original binary, and uses fine-grained differential tracing to iteratively guide the LLM in correcting semantic deviations. Evaluated on Coreutils and Binutils, the method achieves 100% function-level compilability on unstripped binaries, behavioral consistency of 99.55% and 99.89%, and repairs 76.56% and 79.74% of the logic errors exposed by the official test suites, respectively. On fully stripped binaries it still maintains over 96% behavioral consistency, mitigating LLM-induced semantic hallucinations.
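To make the idea of differential tracing concrete, the sketch below compares a reference trace (from the original function) with a candidate trace (from the recompiled function) and reports the first point where they disagree. This is an illustrative assumption about what such feedback could look like, not the paper's implementation; `TraceEvent`, `Divergence`, and `first_divergence` are hypothetical names.

```python
from typing import NamedTuple, Optional, Sequence


class TraceEvent(NamedTuple):
    """One observation in a trace (hypothetical format)."""
    location: str  # e.g. a basic-block label or source line
    value: int     # value observed at that location


class Divergence(NamedTuple):
    """First position where two traces disagree."""
    index: int
    expected: Optional[TraceEvent]  # None if the reference trace ended first
    actual: Optional[TraceEvent]    # None if the candidate trace ended first


def first_divergence(
    reference: Sequence[TraceEvent],
    candidate: Sequence[TraceEvent],
) -> Optional[Divergence]:
    """Return the first mismatch between the traces, or None if they match."""
    # Compare events pairwise over the common prefix.
    for i, (ref, cand) in enumerate(zip(reference, candidate)):
        if ref != cand:
            return Divergence(i, ref, cand)
    # Same prefix but different lengths: one trace stopped early.
    if len(reference) != len(candidate):
        i = min(len(reference), len(candidate))
        expected = reference[i] if i < len(reference) else None
        actual = candidate[i] if i < len(candidate) else None
        return Divergence(i, expected, actual)
    return None
```

A divergence record of this kind (position, expected event, actual event) is the sort of localized signal that could be turned into a repair prompt, rather than a bare pass/fail result.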

📝 Abstract

Decompilation is foundational to binary analysis, yet conventional tools prioritize human readability over strict recompilability and verifiable runtime correctness. While recent LLM-based approaches attempt to refine decompiled pseudocode, they typically either optimize solely for readability or rely on static analysis for evaluation. This makes them prone to "semantic hallucinations" that compromise accuracy and fail to resolve actual runtime failures. For critical tasks like software modernization and vulnerability remediation, recovered code must not only compile but replicate the original binary's behavior. We present PCodeTrans, a feedback-driven framework that bridges the gap between decompilation, recompilation, and rigorous function-level dynamic validation. After extracting a minimal yet coherent context to guarantee recompilability, PCodeTrans employs an in situ substitutable engine to hot-swap the compiled function directly into the unmodified binary, natively preserving its authentic execution context and global dependencies. Guided by fine-grained differential tracing, PCodeTrans generates precise runtime feedback to iteratively guide an LLM in repairing semantic discrepancies. Evaluated on Coreutils and Binutils, PCodeTrans achieves unprecedented recovery performance when rectifying raw Hex-Rays outputs, attaining 100% function-level compilability on unstripped binaries alongside 99.55% and 99.89% test-validated behavioral consistency, respectively. In doing so, it resolves 76.56% and 79.74% of logic errors exposed by official test suites. Exhibiting exceptional resilience, PCodeTrans maintains over 96% behavioral consistency even on fully stripped binaries. By significantly outperforming all existing baselines, PCodeTrans paves a practical path to reliably translate decompiled pseudocode into compilable and executable equivalents.
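The compile, validate, and repair cycle described in the abstract can be sketched as a bounded loop. This is a minimal sketch under stated assumptions, not PCodeTrans itself: `validate` stands in for the whole recompile, hot-swap, and differential-tracing stage and is assumed to return `None` when behavior matches or a textual feedback message otherwise; `ask_model` stands in for the LLM call; `max_rounds` is an illustrative budget. All three are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RepairResult:
    source: str       # last version of the function source
    rounds: int       # number of repair rounds performed
    consistent: bool  # True if validation eventually passed


def repair_loop(
    pseudocode: str,
    validate: Callable[[str], Optional[str]],
    ask_model: Callable[[str, str], str],
    max_rounds: int = 5,
) -> RepairResult:
    """Iteratively repair `pseudocode` until `validate` reports no feedback.

    `validate` and `ask_model` are caller-supplied placeholders (assumptions),
    standing in for dynamic validation and the LLM respectively.
    """
    source = pseudocode
    for round_no in range(max_rounds + 1):
        feedback = validate(source)
        if feedback is None:
            # Compiles and behaves like the original on the tests.
            return RepairResult(source, round_no, True)
        if round_no == max_rounds:
            break  # repair budget exhausted
        # Hand the current source and runtime feedback to the model.
        source = ask_model(source, feedback)
    return RepairResult(source, max_rounds, False)
```

Keeping the validator and the model behind plain callables means the loop can be exercised with stubs before either a real toolchain or a real model is attached.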

Problem

Research questions and friction points this paper is trying to address.

decompilation

recompilability

runtime correctness

semantic hallucination

binary analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

decompilation

feedback-driven repair

dynamic validation