Code Quality Analysis of Translations from C to Rust

📅 2026-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of systematic evaluation of multidimensional code quality—encompassing performance, robustness, and maintainability—in existing automated C-to-Rust translation tools. Using GNU coreutils as a benchmark, the authors conduct the first comprehensive assessment of three state-of-the-art translators—C2Rust, C2SaferRust, and TranslationGym—leveraging Clippy static analysis, GPT-4o–assisted defect detection, and manual review, with comparisons against human-written Rust translations. The findings reveal that while automated tools successfully eliminate certain unsafe patterns inherent in C, they concurrently introduce new issues, and none consistently outperforms manual rewriting across all quality dimensions. Moreover, even human-translated code exhibits deficiencies such as poor readability and non-idiomatic expressions, underscoring the need for a holistic evaluation framework that extends beyond memory safety alone.

📝 Abstract
C and C++ are prevalent programming languages. Yet, they suffer from significant memory- and thread-safety issues. Recent studies have explored automated translation of C/C++ to safer languages, such as Rust. However, these studies focused mostly on the correctness and safety of the translated code, which are indeed critical, but left other important quality concerns (e.g., performance, robustness, and maintainability) largely unexplored. This work investigates the strengths and weaknesses of three C-to-Rust translators, namely C2Rust (a transpiler), C2SaferRust (an LLM-guided transpiler), and TranslationGym (an LLM-based direct translator). We perform an in-depth quantitative and qualitative analysis of several important quality attributes of the translated Rust code for the popular GNU coreutils, using human-based translation as a baseline. To assess the internal and external quality of the Rust code, we: (i) apply Clippy, a rule-based, state-of-the-practice Rust static analysis tool; (ii) investigate the capability of an LLM (GPT-4o) to identify issues potentially overlooked by Clippy; and (iii) perform a manual analysis of the issues reported by Clippy and GPT-4o. Our results show that while newer techniques reduce some unsafe and non-idiomatic patterns, they frequently introduce new issues, revealing systematic trade-offs that are not visible under existing evaluation practices. Notably, none of the automated techniques consistently matches or exceeds human-written translations across all quality dimensions, yet even human-written Rust code exhibits persistent internal quality issues such as poor readability and non-idiomatic patterns. Together, these findings show that translation quality remains a multi-dimensional challenge, requiring systematic evaluation and targeted tool support beyond both naive automation and manual rewriting.
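As a hypothetical illustration (not taken from the paper's artifacts): the kind of non-idiomatic pattern that C-to-Rust transpilers tend to preserve, and that Clippy flags, is index-based iteration. The first function below triggers Clippy's `needless_range_loop` lint; the second is the idiomatic iterator-based rewrite with identical behavior.

```rust
// C-style translation: index-based loop, as a transpiler like C2Rust
// might emit it. Clippy's `needless_range_loop` lint fires here.
fn sum_squares_c_style(values: &[i64]) -> i64 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i] * values[i];
    }
    total
}

// Idiomatic Rust: iterate over the slice directly; no bounds checks
// in the source, and Clippy is silent.
fn sum_squares_idiomatic(values: &[i64]) -> i64 {
    values.iter().map(|v| v * v).sum()
}

fn main() {
    let data = [1, 2, 3, 4];
    // Both forms compute the same result: 1 + 4 + 9 + 16 = 30.
    assert_eq!(sum_squares_c_style(&data), 30);
    assert_eq!(sum_squares_idiomatic(&data), 30);
    println!("{}", sum_squares_idiomatic(&data));
}
```

Running `cargo clippy` on a crate containing the first function reports the lint with a suggested iterator-based fix, which is the class of internal-quality signal the study aggregates across translators.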
Problem

Research questions and friction points this paper is trying to address.

code quality
C to Rust translation
software maintainability
performance
robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

code translation quality
C-to-Rust transpilation
multi-dimensional code analysis
LLM-assisted code review
static analysis
Biruk Tadesse
North Carolina State University, USA
Vikram Nitin
Columbia University
Machine Learning
Mazin Salah
North Carolina State University, USA
Baishakhi Ray
Associate Professor, Columbia University
Software Engineering · Machine Learning · AI4Code · AI4SE · SE4AI
Marcelo d'Amorim
Associate Professor, NC State University
Software Engineering
Wesley Assunção
North Carolina State University, USA