ReCodeAgent: A Multi-Agent Workflow for Language-agnostic Translation and Validation of Large-scale Repositories

📅 2026-04-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing repository-level code translation approaches are typically built for a single language pair, generalize poorly to other programming languages, and are not fully automated. This work proposes an autonomous multi-agent workflow that, given only a source codebase and a target language, translates and validates the entire repository. Language-specific analysis tools are invoked by the agents inside a language-agnostic synthesis and verification pipeline, which the authors report makes it the first technique to reach high translation success rates across many languages. Experiments on 118 real-world projects show a 60.8% improvement in test pass rate over existing methods, at an average cost of $15.30 per project.

📝 Abstract

Most repository-level code translation and validation techniques have been evaluated on a single source-target programming language (PL) pair, owing to the complex engineering effort required to adapt to new PL pairs. Programming agents can enable PL-agnosticism in repository-level code translation and validation: they can synthesize code across many PLs and autonomously use existing analysis tools specific to each PL. However, the state of the art has yet to offer a fully autonomous agentic approach for repository-level code translation and validation of large-scale programs. This paper proposes ReCodeAgent, an autonomous multi-agent approach for language-agnostic repository-level code translation and validation. Users only need to provide the project in the source PL and specify the target PL for ReCodeAgent to automatically translate and validate the entire repository. ReCodeAgent is the first technique to achieve high translation success rates across many PLs. We compare the effectiveness of ReCodeAgent with four alternative neuro-symbolic and agentic approaches on 118 real-world projects, averaging 1,975 LoC and 43 translation units per project. The projects cover 6 PLs (C, Go, Java, JavaScript, Python, and Rust) and 4 PL pairs (C-Rust, Go-Rust, Java-Python, Python-JavaScript). Our results demonstrate that ReCodeAgent consistently outperforms prior techniques on translation correctness, improving test pass rate by 60.8% on ground-truth tests, with an average cost of $15.30 per project. We also perform a process-centric analysis of ReCodeAgent trajectories to confirm its procedural efficiency. Finally, we investigate how the design choice of a multi-agent vs. single-agent architecture influences ReCodeAgent performance: with a single-agent architecture, the test pass rate drops by 40.4% on average, and trajectories become 28% longer and persistently inefficient.
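The user-facing contract described in the abstract (a source project plus a target PL goes in, a translated and validated repository comes out) can be illustrated with a minimal sketch. This is not ReCodeAgent's actual design: the `TranslationUnit` type, the `translate` and `validate` callables standing in for agents, and the bounded retry loop are all hypothetical, introduced here only to show how a language-agnostic driver might delegate language-specific work to pluggable components.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical agent interfaces (assumptions, not taken from the paper):
# a translator maps (code, source_pl, target_pl) to translated code,
# a validator maps (translated_code, target_pl) to a pass/fail verdict.
Translator = Callable[[str, str, str], str]
Validator = Callable[[str, str], bool]


@dataclass
class TranslationUnit:
    """One unit of the repository (for example a file or a function)."""
    name: str
    source_code: str
    translated_code: str = ""
    validated: bool = False


def translate_repository(
    units: Dict[str, str],
    source_pl: str,
    target_pl: str,
    translate: Translator,
    validate: Validator,
    max_attempts: int = 3,
) -> List[TranslationUnit]:
    """Translate every unit, retrying until validation passes or attempts run out.

    The driver itself never inspects the languages; all language-specific
    behavior lives in the injected `translate` and `validate` callables.
    """
    results: List[TranslationUnit] = []
    for name, code in units.items():
        unit = TranslationUnit(name=name, source_code=code)
        for _ in range(max_attempts):
            unit.translated_code = translate(unit.source_code, source_pl, target_pl)
            unit.validated = validate(unit.translated_code, target_pl)
            if unit.validated:
                break
        results.append(unit)
    return results


def pass_rate(results: List[TranslationUnit]) -> float:
    """Fraction of units whose translation passed validation (0.0 if empty)."""
    if not results:
        return 0.0
    return sum(unit.validated for unit in results) / len(results)
```

In this sketch, supporting a new PL pair would mean supplying different `translate` and `validate` callables while the driver stays unchanged, which is the sense of "language-agnostic" the abstract points at. A real system would replace the stubs with LLM-backed agents and with each language's own compiler and test tooling.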

Problem

Research questions and friction points this paper is trying to address.

code translation

repository-level

language-agnostic

program validation

multi-agent

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent

language-agnostic

repository-level translation