🤖 AI Summary
This work addresses the challenges of using large language models to translate code that has external dependencies: the models often hallucinate APIs, omit required imports, and produce output whose semantic equivalence is hard to verify when opaque library types are involved. To tackle these issues, the authors propose an end-to-end translation and validation framework for migrating Go code to Rust. The approach combines retrieval of Rust APIs that correspond to the Go library APIs in use, synthesis of cross-language adapters built only from public library APIs, and I/O-based semantic equivalence checking. Evaluated on six real-world Go projects, the method substantially improves both compilation and equivalence success rates, reaching up to 100% in the most dependency-heavy case and roughly a two-fold improvement on average.
📝 Abstract
Large Language Models (LLMs) have shown promise for program translation, particularly for migrating systems code to memory-safe languages such as Rust. However, existing approaches struggle when source programs depend on external libraries: LLMs frequently hallucinate non-existent target APIs and fail to generate the imports needed to call them; moreover, validating semantic equivalence is challenging when the code manipulates opaque, library-defined types. We present a translation and validation framework for translating Go projects with external dependencies to Rust. Our approach combines (i) a retrieval mechanism that maps Go library APIs to Rust APIs, and (ii) a cross-language validation pipeline that establishes language interoperability in the presence of opaque library types by synthesising adapters exclusively from public library APIs, prior to validating I/O equivalence. We evaluate our system on six real-world Go repositories with non-trivial external dependencies. Our approach significantly increases both the compilation and equivalence success rates (up to 100% in the most dependency-heavy case; approx. 2x on average) by enabling validated translation of code that manipulates opaque, library-defined types.
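To make the validation idea concrete, the following is a minimal sketch of adapter-based I/O equivalence checking, written in Python for brevity rather than in Go or Rust. It is not the authors' implementation. Every name in it (`SourceVersion`, `TargetVersion`, `bump_minor_source`, `bump_minor_target`, `adapt_source`, `adapt_target`, `find_mismatches`) is hypothetical, and the two version classes merely stand in for opaque types from a source-side and a target-side library. The adapters read each opaque value only through its public accessors and map it to a neutral representation that can be compared across implementations.

```python
from typing import Any, Callable, Iterable, List, Tuple


class SourceVersion:
    """Hypothetical opaque type from a source-side library."""

    def __init__(self, text: str) -> None:
        self._parts = tuple(int(p) for p in text.split("."))

    def segments(self) -> Tuple[int, ...]:
        """Public accessor: the only way the adapter looks inside."""
        return self._parts


class TargetVersion:
    """Hypothetical opaque type from the corresponding target-side library."""

    def __init__(self, major: int, minor: int, patch: int) -> None:
        self._major, self._minor, self._patch = major, minor, patch

    def major(self) -> int:
        return self._major

    def minor(self) -> int:
        return self._minor

    def patch(self) -> int:
        return self._patch


def bump_minor_source(text: str) -> SourceVersion:
    """Stand-in for the original function."""
    major, minor, _ = SourceVersion(text).segments()
    return SourceVersion(f"{major}.{minor + 1}.0")


def bump_minor_target(text: str) -> TargetVersion:
    """Stand-in for a correct translation."""
    major, minor, _ = (int(p) for p in text.split("."))
    return TargetVersion(major, minor + 1, 0)


def bump_minor_buggy(text: str) -> TargetVersion:
    """Stand-in for a faulty translation: it forgets to reset the patch."""
    major, minor, patch = (int(p) for p in text.split("."))
    return TargetVersion(major, minor + 1, patch)


def adapt_source(value: SourceVersion) -> List[int]:
    """Adapter: opaque source value -> neutral list, via public API only."""
    return list(value.segments())


def adapt_target(value: TargetVersion) -> List[int]:
    """Adapter: opaque target value -> neutral list, via public API only."""
    return [value.major(), value.minor(), value.patch()]


def find_mismatches(
    src_fn: Callable[[Any], Any],
    dst_fn: Callable[[Any], Any],
    adapt_src: Callable[[Any], Any],
    adapt_dst: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> List[Tuple[Any, Any, Any]]:
    """Run both functions on each input and compare the adapted outputs."""
    mismatches = []
    for x in inputs:
        expected = adapt_src(src_fn(x))
        actual = adapt_dst(dst_fn(x))
        if expected != actual:
            mismatches.append((x, expected, actual))
    return mismatches


INPUTS = ["1.2.3", "0.0.0", "10.9.8"]
```

Calling `find_mismatches` with `bump_minor_target` yields an empty list for these inputs, while `bump_minor_buggy` is flagged on every input whose patch component is non-zero. Agreement on a finite input set is evidence of equivalence on those inputs only, not a proof for all inputs.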