Partial Cross-Compilation and Mixed Execution for Accelerating Dynamic Binary Translation

📅 2025-11-29
🤖 AI Summary
Cross-ISA program execution faces two challenges: dynamic binary translation (DBT) performs poorly, and full cross-compilation is often infeasible. This paper proposes a hybrid execution system that combines compilation and emulation, departing from the conventional choice between fully cross-compiling a program and fully emulating it, by introducing a fine-grained, function-level offloading mechanism. Using LLVM-based static analysis, the system automatically identifies offloadable functions; a customized QEMU-based inter-ISA calling interface and a lightweight runtime coordination protocol then let those functions execute natively on the host. The core contributions are automated, low-overhead function-level offloading decisions and robust cross-architecture function invocation. Experimental evaluation shows up to 13× speedup over existing DBT systems, and the system works automatically for both applications and libraries.
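One way to picture the automatic identification of offloadable functions is as a reachability check on a call graph. The sketch below is illustrative only and is not the paper's analysis: it assumes a function can be offloaded when neither it nor anything it transitively calls contains ISA-specific code or a missing dependency. The names `offloadable_functions`, `call_graph`, and `blocked` are hypothetical.

```python
from collections import deque


def offloadable_functions(call_graph, blocked):
    """Return functions that never reach a blocked function (assumed criterion).

    call_graph: dict mapping a function name to the names it calls.
    blocked: names assumed to be non-portable (e.g. inline assembly,
             or a dependency unavailable on the host).
    """
    # Invert the graph so we can walk from a blocked callee up to its callers.
    callers = {f: set() for f in call_graph}
    for f, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(f)

    # Breadth-first propagation: anything that can reach a blocked
    # function is itself treated as not offloadable.
    tainted = set(blocked)
    queue = deque(blocked)
    while queue:
        g = queue.popleft()
        for caller in callers.get(g, ()):
            if caller not in tainted:
                tainted.add(caller)
                queue.append(caller)

    return set(callers) - tainted


# Hypothetical program: only memcpy_asm contains ISA-specific code.
call_graph = {
    "main": ["parse", "compress"],
    "parse": ["memcpy_asm"],
    "compress": ["hash"],
    "hash": [],
    "memcpy_asm": [],
}
candidates = offloadable_functions(call_graph, blocked={"memcpy_asm"})
```

Under this assumption `compress` and `hash` remain candidates for native execution, while `parse` and `main` stay under emulation because they reach `memcpy_asm`.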

📝 Abstract
With the growing diversity of instruction set architectures (ISAs), cross-ISA program execution has become common. Dynamic binary translation (DBT) is the main solution but suffers from poor performance. Cross-compilation avoids emulation costs but is constrained by an "all-or-nothing" model: programs are either fully cross-compiled or entirely emulated. Complete cross-compilation is often infeasible due to ISA-specific code or missing dependencies, leaving programs with high emulation overhead. We propose a hybrid execution system that combines compilation and emulation, featuring a selective function offloading mechanism. This mechanism establishes cross-environment calling channels, offloading eligible functions to the host for native execution to reduce DBT overhead. Key optimizations address offloading costs, enabling efficient hybrid operation. Built on LLVM and QEMU, the system works automatically for both applications and libraries. Evaluations show it achieves up to 13× speedups over existing DBT, with strong practical value.
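The role of offloading costs can be made concrete with a simple break-even check. The sketch below is an assumed cost model, not one taken from the paper: offloading a function pays off only when native execution time plus the per-call cost of crossing between the emulated and host environments is lower than the emulated execution time. All function names and timings are hypothetical.

```python
def emulated_cost_ns(calls, emulated_ns):
    """Total time if every call runs under DBT."""
    return calls * emulated_ns


def offloaded_cost_ns(calls, native_ns, crossing_ns):
    """Total time if every call runs natively but pays a crossing cost."""
    return calls * (native_ns + crossing_ns)


def worth_offloading(calls, emulated_ns, native_ns, crossing_ns):
    """True when the offloaded total is strictly lower than the emulated total."""
    return offloaded_cost_ns(calls, native_ns, crossing_ns) < emulated_cost_ns(
        calls, emulated_ns
    )


# Hypothetical numbers: a heavy kernel amortizes the crossing cost,
# while a tiny helper called the same number of times does not.
heavy = worth_offloading(calls=1_000, emulated_ns=50_000, native_ns=4_000, crossing_ns=1_000)
tiny = worth_offloading(calls=1_000, emulated_ns=200, native_ns=20, crossing_ns=1_000)
```

In this toy model the heavy kernel is worth offloading and the tiny helper is not, which is one reason reducing the per-call crossing cost matters for fine-grained, function-level offloading.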
Problem

Research questions and friction points this paper is trying to address.

Accelerates dynamic binary translation via hybrid compilation-emulation
Enables selective function offloading to reduce emulation overhead
Overcomes all-or-nothing cross-compilation limitations for diverse ISAs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid execution system combining compilation and emulation
Selective function offloading to host for native execution
Built on LLVM and QEMU for automatic operation
Yuhao Gu
Sun Yat-sen University, Guangzhou, China
Zhongchun Zheng
Sun Yat-sen University, Guangzhou, China
Nong Xiao
Sun Yat-sen University, Guangzhou, China
Yutong Lu
Sun Yat-sen University, Guangzhou, China
Xianwei Zhang
Sun Yat-sen U.; AMD Research/RTG
Architecture/System, Compilation, GPU/Memory, HPC, Simulation/Modeling