🤖 AI Summary
This work addresses the challenge of automating the migration of C code to memory-safe Rust. It introduces SafeTrans, an end-to-end translation framework built on large language models (LLMs) that combines few-shot guided error repair, error-type-aware prompting, unit-test-driven verification, and multi-round iterative feedback from compilation and execution. It also evaluates whether translation mitigates memory-safety vulnerabilities present in the original C code, including buffer overflows and null-pointer dereferences. Experiments cover six leading LLMs and 2,653 C programs with unit tests; for the best-performing model, GPT-4o, iterative repair raises the translation success rate from 54% to 80%. All types of vulnerabilities identified in the original C code are mitigated in the translated Rust code.
📝 Abstract
Rust is a strong contender for a memory-safe alternative to C as a "systems" programming language, but porting the vast amount of existing C code to Rust is a daunting task. In this paper, we evaluate the potential of large language models (LLMs) to automate the transpilation of C code to idiomatic Rust, while ensuring that the generated code mitigates any memory-related vulnerabilities present in the original code. To that end, we present the design and implementation of SafeTrans, a framework that uses LLMs to i) transpile C code into Rust and ii) iteratively fix any compilation and runtime errors in the resulting code. A key novelty of our approach is the introduction of a few-shot guided repair technique for translation errors, which provides contextual information and example code snippets for specific error types, guiding the LLM toward the correct solution. Another novel aspect of our work is the evaluation of the security implications of the transpilation process, i.e., whether potential vulnerabilities in the original C code have been properly addressed in the translated Rust code. We experimentally evaluated SafeTrans with six leading LLMs and a set of 2,653 C programs accompanied by comprehensive unit tests, which were used for validating the correctness of the translated code. Our results show that our iterative repair strategy improves the rate of successful translations from 54% to 80% for the best-performing LLM (GPT-4o), and that all types of identified vulnerabilities in the original C code are effectively mitigated in the translated Rust code.
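The translate-then-repair loop described in the abstract can be illustrated with a minimal Python sketch. It is not the SafeTrans implementation: the names `translate_with_repair`, `build_repair_prompt`, `CheckResult`, and `FEW_SHOT_HINTS`, the prompt wording, and the round limit are all assumptions made for illustration. The LLM and the compile-and-test step are passed in as plain callables so the sketch runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical few-shot hints keyed by rustc error code (illustrative only,
# not taken from the paper). Each entry pairs context with a small example.
FEW_SHOT_HINTS = {
    "E0382": "Use of moved value. Example fix: pass `&v` instead of `v`, or call `v.clone()`.",
    "E0308": "Mismatched types. Example fix: convert explicitly, e.g. `x as usize`.",
}


@dataclass
class CheckResult:
    """Outcome of compiling the Rust code and running its unit tests."""
    ok: bool
    error_code: Optional[str] = None
    message: str = ""


def build_repair_prompt(rust_code: str, result: CheckResult) -> str:
    """Assemble a repair prompt with an error-specific hint when one exists."""
    hint = FEW_SHOT_HINTS.get(result.error_code or "", "No example available.")
    return (
        f"The following Rust code fails with {result.error_code}: {result.message}\n"
        f"Hint: {hint}\n"
        f"Code:\n{rust_code}\n"
        "Return the corrected Rust code only."
    )


def translate_with_repair(
    c_code: str,
    llm: Callable[[str], str],            # assumed interface: prompt in, code out
    check: Callable[[str], CheckResult],  # assumed interface: compile and run tests
    max_rounds: int = 3,                  # assumed limit on repair rounds
) -> Tuple[Optional[str], int]:
    """Translate once, then repair until the checks pass or rounds run out.

    Returns (rust_code, repair_rounds_used), or (None, max_rounds) on failure.
    """
    rust_code = llm(f"Translate this C program to idiomatic Rust:\n{c_code}")
    for rounds_used in range(max_rounds + 1):
        result = check(rust_code)
        if result.ok:
            return rust_code, rounds_used
        if rounds_used < max_rounds:
            rust_code = llm(build_repair_prompt(rust_code, result))
    return None, max_rounds
```

In a real pipeline, `check` would invoke the Rust toolchain and the program's unit tests and extract the error category from the diagnostics; here that step is left abstract on purpose.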