PR2: Peephole Raw Pointer Rewriting with LLMs for Translating C to Safer Rust

📅 2025-05-07
🤖 AI Summary
Problem: Automated C-to-Rust translation via C2RUST frequently introduces raw pointers, which undermines Rust's memory and thread safety guarantees. Method: The authors propose a function-level "peephole rewriting" approach that replaces local raw pointers with safe Rust abstractions (e.g., `Box`, `Vec`, and references). A decision-tree-guided prompting strategy steers an LLM (gpt-4o-mini) through the pointer lifting, and errors surfaced by compilation and test execution are repaired incrementally with the help of code change analysis. Contribution/Results: The work contributes a decision-tree-structured prompting scheme and a code-change-aware repair strategy for C2RUST output. Evaluated on 28 real-world C projects, the method eliminates 13.22% of local raw pointers across these projects, taking 5.44 hours and costing $1.46 per project on average.

📝 Abstract
Interest in translating C code to Rust has been growing because of Rust's robust memory and thread safety guarantees. Tools such as C2RUST enable syntax-guided transpilation from C to semantically equivalent Rust code. However, the resulting Rust programs often rely heavily on unsafe constructs, particularly raw pointers, which undermines Rust's safety guarantees. This paper aims to improve the memory safety of Rust programs generated by C2RUST by eliminating raw pointers. Specifically, we propose a peephole raw pointer rewriting technique that lifts raw pointers in individual functions to appropriate Rust data structures. Technically, PR2 employs decision-tree-based prompting to guide the pointer lifting process. It also uses code change analysis to guide the repair of errors introduced during rewriting, addressing errors that surface during compilation and test case execution. We implement PR2 as a prototype and evaluate it using gpt-4o-mini on 28 real-world C projects. The results show that PR2 successfully eliminates 13.22% of local raw pointers across these projects, significantly enhancing the safety of the translated Rust code. On average, PR2 completes the transformation of a project in 5.44 hours, at an average cost of $1.46.
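
To make "lifting a local raw pointer" concrete, here is a minimal hand-written sketch. It is not output from C2RUST or PR2; the function names are hypothetical, and `Box::into_raw` and a `Vec`-backed buffer merely stand in for the `malloc`-style allocations that transpiled code would typically contain. Each `*_raw` function shows the kind of local raw pointer the abstract describes, and each `*_lifted` function shows one plausible safe replacement (a reference, a `Box`, and a `Vec`).

```rust
// Local raw pointer to a stack value (illustrative "before").
fn bump_raw(mut counter: i32) -> i32 {
    let p: *mut i32 = &mut counter;
    unsafe { *p += 1; }
    counter
}

// Same logic with the pointer lifted to a mutable reference.
fn bump_lifted(mut counter: i32) -> i32 {
    let p: &mut i32 = &mut counter;
    *p += 1;
    counter
}

// Local raw pointer owning a single heap value (illustrative "before").
// Box::into_raw / Box::from_raw stand in for malloc / free here.
fn double_raw(x: i32) -> i32 {
    let p: *mut i32 = Box::into_raw(Box::new(x));
    unsafe {
        *p *= 2;
        let out = *p;
        drop(Box::from_raw(p)); // release the allocation exactly once
        out
    }
}

// Same logic with the pointer lifted to Box; the free is now implicit.
fn double_lifted(x: i32) -> i32 {
    let mut p: Box<i32> = Box::new(x);
    *p *= 2;
    *p
}

// Local raw pointer used as an array via offsets (illustrative "before").
fn sum_squares_raw(n: usize) -> i32 {
    let mut storage = vec![0i32; n];
    let buf: *mut i32 = storage.as_mut_ptr();
    let mut total = 0;
    for i in 0..n {
        unsafe {
            *buf.add(i) = (i * i) as i32;
            total += *buf.add(i);
        }
    }
    total
}

// Same logic with the buffer lifted to Vec; indexing is bounds-checked.
fn sum_squares_lifted(n: usize) -> i32 {
    let buf: Vec<i32> = (0..n).map(|i| (i * i) as i32).collect();
    buf.iter().sum()
}
```

Each pair computes the same result, so a test suite that passed on the raw version should also pass on the lifted one; that equivalence is what compile-and-test feedback would be checking.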

Problem

Research questions and friction points this paper is trying to address.

Eliminate raw pointers in C-to-Rust translated code
Enhance memory safety in Rust programs from C2RUST
Automate pointer lifting using LLM-guided peephole rewriting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Peephole raw pointer rewriting technique
Decision-tree-based prompting for pointer lifting
Code change analysis for error repair
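
The two ideas above can be sketched roughly as follows. This is an illustrative assumption, not PR2's actual decision tree or repair algorithm, neither of which is specified on this page. `PointerFacts`, `Lift`, `choose_lift`, and `rewrite_with_repair` are hypothetical names; the `check` closure stands in for compiling and running tests, and the `repair` closure stands in for an LLM call that sees the original code, the current rewrite, and the error message. Falling back to the original function when repair fails is likewise an assumption.

```rust
// Hypothetical facts gathered about one local raw pointer.
#[derive(Clone, Copy)]
struct PointerFacts {
    heap_allocated: bool,  // initialised from a malloc-style call
    indexed: bool,         // accessed with offsets, so it acts as an array
    written_through: bool, // the pointee is mutated via the pointer
}

#[derive(Debug, PartialEq)]
enum Lift {
    Vec,
    Box,
    MutRef,
    SharedRef,
}

// A toy decision tree: each branch narrows the choice of safe abstraction.
fn choose_lift(f: PointerFacts) -> Lift {
    if f.heap_allocated {
        if f.indexed { Lift::Vec } else { Lift::Box }
    } else if f.written_through {
        Lift::MutRef
    } else {
        Lift::SharedRef
    }
}

// Feedback-driven repair: retry until `check` passes or rounds run out.
fn rewrite_with_repair<C, R>(
    original: &str,
    candidate: String,
    max_rounds: usize,
    check: C,
    repair: R,
) -> String
where
    C: Fn(&str) -> Result<(), String>,
    R: Fn(&str, &str, &str) -> String,
{
    let mut current = candidate;
    for _ in 0..max_rounds {
        match check(&current) {
            Ok(()) => return current,
            // Pass the original alongside the rewrite so the repair step
            // can reason about what changed, not just about the error.
            Err(msg) => current = repair(original, &current, &msg),
        }
    }
    // Assumed fallback: keep the original function if repair did not converge.
    if check(&current).is_ok() { current } else { original.to_string() }
}
```

In this sketch the branch taken in `choose_lift` would select which prompt to send, and `rewrite_with_repair` bounds how many repair attempts are spent on a single function before giving up.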

Authors
Yifei Gao
Department of Computer Science, Purdue University, West Lafayette, USA
Chengpeng Wang
Purdue University
AI for Code, Program Analysis, Software Engineering
Pengxiang Huang
Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA
Xuwei Liu
Purdue University
Software Engineering, Programming Language
Mingwei Zheng
Purdue University
Large Language Models, Software Engineering
Xiangyu Zhang
Department of Computer Science, Purdue University, West Lafayette, USA