His2Trans: A Skeleton First Framework for Self Evolving C to Rust Translation with Historical Retrieval

📅 2026-03-03
🤖 AI Summary
Industrial-scale automated migration from C to Rust often suffers from type inference errors, hallucinated dependencies, and inefficient repair cycles because build context is missing and domain knowledge is insufficient. This work proposes a skeleton-first, self-evolving migration framework that decouples global verification from local code generation through a build-aware, project-level skeleton graph. It mines historical migration traces to extract API usage rules and uses retrieval-augmented generation (RAG) to guide large language models during incremental migration. By combining a deterministic skeleton with a self-evolving knowledge base, the framework reaches a 99.75% incremental compilation pass rate on industrial OpenHarmony modules, lowers the unsafe code ratio by 23.6 percentage points relative to C2Rust on general-purpose benchmarks, and cuts repair overhead on unseen tasks by about 60%.
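To make the retrieval step concrete, here is a minimal sketch of looking up API rules for a C snippet. It is not the paper's implementation: the `ApiRule` type, the three example rules, and the token-overlap scoring are all assumptions chosen for illustration, standing in for whatever rule format and retriever His2Trans actually uses.

```rust
use std::collections::HashSet;

/// Hypothetical rule mined from a past migration: a C-side pattern and
/// a hint about the idiomatic Rust counterpart.
struct ApiRule {
    c_pattern: &'static str,
    rust_hint: &'static str,
}

/// Illustrative rules only; they are not taken from the paper.
fn demo_rules() -> Vec<ApiRule> {
    vec![
        ApiRule { c_pattern: "malloc sizeof", rust_hint: "Vec::with_capacity" },
        ApiRule { c_pattern: "memcpy", rust_hint: "copy_from_slice" },
        ApiRule { c_pattern: "strlen", rust_hint: "CStr::to_bytes().len()" },
    ]
}

/// Split text into lowercase identifier-like tokens.
fn tokens(s: &str) -> HashSet<String> {
    s.split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Return up to `k` rules whose C pattern shares tokens with the snippet,
/// best match first. Token overlap is a stand-in for a real retriever.
fn retrieve<'a>(rules: &'a [ApiRule], c_snippet: &str, k: usize) -> Vec<&'a ApiRule> {
    let query = tokens(c_snippet);
    let mut scored: Vec<(usize, &ApiRule)> = rules
        .iter()
        .map(|r| (tokens(r.c_pattern).intersection(&query).count(), r))
        .filter(|(score, _)| *score > 0)
        .collect();
    // Stable sort keeps the original rule order among equal scores.
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().take(k).map(|(_, r)| r).collect()
}
```

In a RAG setup, the hints returned by something like `retrieve` would be placed in the prompt next to the C function being translated, so the model is nudged toward an existing interface instead of inventing one.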

📝 Abstract
Automated C-to-Rust migration encounters systemic obstacles when scaling from code snippets to industrial projects, mainly because build context is often unavailable ("dependency hell") and domain-specific evolutionary knowledge is missing. As a result, current LLM-based methods frequently cannot reconstruct precise type definitions under complex build systems or infer idiomatic API correspondences, which in turn leads to hallucinated dependencies and unproductive repair loops. To tackle these issues, we introduce His2Trans, a framework that combines a deterministic, build-aware skeleton with self-evolving knowledge extraction to support stable, incremental migration. On the structural side, His2Trans performs build tracing to create a compilable Project-Level Skeleton Graph, providing a strictly typed environment that separates global verification from local logic generation. On the cognitive side, it derives fine-grained API and code-fragment rules from historical migration traces and uses a Retrieval-Augmented Generation (RAG) system to steer the LLM toward idiomatic interface reuse. Experiments on industrial OpenHarmony modules show that His2Trans reaches a 99.75% incremental compilation pass rate, effectively fixing build failures where baselines struggle. On general-purpose benchmarks, it lowers the unsafe code ratio by 23.6 percentage points compared to C2Rust while producing the fewest warnings. Finally, knowledge accumulation studies demonstrate the framework's evolutionary behavior: by continuously integrating verified patterns, His2Trans cuts repair overhead on unseen tasks by about 60%.
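The skeleton-first idea can be sketched as follows, under stated assumptions. The `SkeletonNode` type, the example items, and the layered ordering routine are hypothetical and are not drawn from the paper; they only illustrate how typed stub signatures plus dependency edges allow bodies to be filled in one at a time while the whole project stays compilable.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical skeleton node: a typed Rust signature whose body is still
/// a stub, plus the names of the project items it depends on.
#[derive(Debug, Clone)]
struct SkeletonNode {
    signature: String,
    deps: Vec<String>,
}

fn node(signature: &str, deps: &[&str]) -> SkeletonNode {
    SkeletonNode {
        signature: signature.to_string(),
        deps: deps.iter().map(|d| d.to_string()).collect(),
    }
}

/// Illustrative three-item project; the names are made up.
fn demo_graph() -> BTreeMap<String, SkeletonNode> {
    let mut g = BTreeMap::new();
    g.insert("Config".to_string(), node("pub struct Config { pub retries: u32 }", &[]));
    g.insert(
        "load".to_string(),
        node("pub fn load(path: &str) -> Config { unimplemented!() }", &["Config"]),
    );
    g.insert(
        "run".to_string(),
        node("pub fn run(cfg: &Config) -> i32 { unimplemented!() }", &["Config", "load"]),
    );
    g
}

/// Layered topological sort: an item becomes ready once every in-project
/// dependency has been handled. Dependencies that are not in the graph are
/// treated as external and already satisfied. Returns None on a cycle.
fn translation_order(graph: &BTreeMap<String, SkeletonNode>) -> Option<Vec<String>> {
    let mut done: BTreeSet<String> = BTreeSet::new();
    let mut order = Vec::new();
    while done.len() < graph.len() {
        let ready: Vec<String> = graph
            .iter()
            .filter(|(name, n)| {
                !done.contains(*name)
                    && n.deps.iter().all(|d| done.contains(d) || !graph.contains_key(d))
            })
            .map(|(name, _)| name.clone())
            .collect();
        if ready.is_empty() {
            return None; // remaining items depend on each other
        }
        for name in ready {
            done.insert(name.clone());
            order.push(name);
        }
    }
    Some(order)
}
```

Because every signature is fixed up front, replacing one `unimplemented!()` body at a time lets the compiler check that single change against the rest of the skeleton, which is the sense in which global verification is separated from local generation.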
Problem

Research questions and friction points this paper is trying to address.

C-to-Rust migration
dependency hell
build context
evolutionary knowledge
idiomatic API correspondence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Skeleton Graph
Self-Evolving Translation
Build-Aware Migration
Retrieval-Augmented Generation (RAG)
C-to-Rust