Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees

📅 2025-06-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address poor portability, low semantic fidelity, and unreliable verification when translating low-level programs from CISC to RISC architectures, this paper proposes GG (Guaranteed Guess), a closed-loop “generate, then verify by testing” pipeline. A pre-trained large language model produces candidate translations from one ISA to another, and each candidate is embedded in a unit-test framework with enforced high code coverage (>98%), coupled with lightweight execution analysis, so that confidence in the translation is quantified through tests rather than assumed. Experiments show 99% functional/semantic correctness on HumanEval and 49% on BringupBench. Compared to Rosetta 2 on Apple Silicon, the transpiled code runs 1.73× faster, is 1.47× more energy efficient (roughly 32% lower energy consumption), and is 2.41× better in memory usage (roughly 59% smaller footprint). The result is a test-verified approach to cross-architecture translation.
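The summary's percentages (32% and 59%) and the abstract's ratios (1.47x and 2.41x) describe the same measurements, as long as an "N x better" figure is read as baseline divided by GG. That reading is an assumption made here, not something this page states. A minimal sketch of the conversion, using a hypothetical helper name:

```python
def percent_reduction(improvement_factor: float) -> float:
    """Percent reduction implied by an 'N x better' ratio.

    Assumes the ratio is baseline / ours, so ours = baseline / N.
    """
    if improvement_factor <= 0:
        raise ValueError("improvement factor must be positive")
    return (1.0 - 1.0 / improvement_factor) * 100.0


energy = percent_reduction(1.47)  # energy efficiency ratio from the abstract
memory = percent_reduction(2.41)  # memory usage ratio from the abstract
print(f"energy: {energy:.1f}% lower, memory: {memory:.1f}% lower")
# prints "energy: 32.0% lower, memory: 58.5% lower"
```

Rounded to whole percentages, these match the 32% and 59% figures quoted in the summary.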

📝 Abstract

The hardware ecosystem is rapidly evolving, with increasing interest in translating low-level programs across different instruction set architectures (ISAs) in a quick, flexible, and correct way to enhance the portability and longevity of existing code. A particularly challenging class of this transpilation problem is translating between complex (CISC) and reduced (RISC) hardware architectures, due to fundamental differences in instruction complexity, memory models, and execution paradigms. In this work, we introduce GG (Guaranteed Guess), an ISA-centric transpilation pipeline that combines the translation power of pre-trained large language models (LLMs) with the rigor of established software testing constructs. Our method uses an LLM to generate candidate translations from one ISA to another, and embeds these translations within a software-testing framework to build quantifiable confidence in the translation. We evaluate our GG approach over two diverse datasets, enforce high code coverage (>98%) across unit tests, and achieve functional/semantic correctness of 99% on HumanEval programs and 49% on BringupBench programs. Further, we compare our approach to the state-of-the-art Rosetta 2 framework on Apple Silicon, showcasing 1.73x faster runtime performance, 1.47x better energy efficiency, and 2.41x better memory usage for our transpiled code, demonstrating the effectiveness of GG for real-world CISC-to-RISC translation tasks. We will open-source our code, data, models, and benchmarks to establish a common foundation for ISA-level code translation research.
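The "generate candidates, then gate them with unit tests" idea described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the names `UnitTest`, `passes_all`, and `guess_and_verify` are hypothetical, the candidate list stands in for LLM output, and plain Python functions stand in for assembling and running translated code on the target ISA.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Sequence


@dataclass(frozen=True)
class UnitTest:
    """One input/output expectation taken from the source program's behavior."""
    args: tuple
    expected: object


def passes_all(run: Callable[..., object], tests: Sequence[UnitTest]) -> bool:
    """Return True only if the candidate matches every expected output."""
    for test in tests:
        try:
            if run(*test.args) != test.expected:
                return False
        except Exception:
            # A crashing candidate counts as a failed translation.
            return False
    return True


def guess_and_verify(
    candidates: Iterable[str],
    execute: Callable[[str], Callable[..., object]],
    tests: Sequence[UnitTest],
) -> Optional[str]:
    """Return the first candidate translation that passes all tests, else None."""
    for candidate in candidates:
        if passes_all(execute(candidate), tests):
            return candidate
    return None


# Stand-ins for "assemble and run on the target ISA" (purely illustrative).
FAKE_BINARIES = {
    "candidate_0": lambda a, b: a - b,  # an incorrect guess
    "candidate_1": lambda a, b: a + b,  # a correct guess
}

tests = [UnitTest((1, 2), 3), UnitTest((0, 0), 0), UnitTest((-4, 4), 0)]
accepted = guess_and_verify(FAKE_BINARIES, FAKE_BINARIES.__getitem__, tests)
print(accepted)  # prints "candidate_1"
```

A passing test suite only supports confidence to the extent that the tests exercise the program, which is why the abstract pairs the approach with an enforced code-coverage threshold. How coverage is measured and how many candidates are sampled is not specified on this page.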

Problem

Research questions and friction points this paper is trying to address.

Translate CISC-to-RISC code accurately and efficiently

Combine LLMs with testing for reliable transpilation

Improve performance and energy efficiency in ISA translation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines LLMs with software testing for transpilation

Enforces high unit-test code coverage (>98%) to back test-based correctness guarantees

Outperforms Rosetta 2 on Apple Silicon: 1.73x faster runtime, 1.47x better energy efficiency, and 2.41x better memory usage
