🤖 AI Summary
To address semantic inaccuracy and limited use of contextual and structural cues in LLM-based code translation, this paper proposes BabelCoder, a modular, agent-based framework for migrating code from one programming language to another. The framework decouples translation, testing, and repair into separate, specialized agents: one generates the target code, one validates its correctness, and one repairs the errors that validation exposes. It combines structural code analysis, test-driven validation, and iterative error correction to improve semantic fidelity and robustness. Evaluated on four benchmark datasets against four state-of-the-art baselines, BabelCoder achieves an average accuracy of 94.16% and outperforms existing methods by 0.5% to 13.5% in 94% of cases. These results suggest that the approach is effective across the evaluated benchmarks.
📝 Abstract
As software systems evolve, developers increasingly work across multiple programming languages and often face the need to migrate code from one language to another. While automatic code translation offers a promising solution, it has long remained a challenging task. Recent advancements in Large Language Models (LLMs) have shown potential for this task, yet existing approaches remain limited in accuracy and fail to effectively leverage contextual and structural cues within the code. Prior work has explored translation and repair mechanisms, but lacks a structured, agentic framework where multiple specialized agents collaboratively improve translation quality. In this work, we introduce BabelCoder, an agentic framework that performs code translation by decomposing the task into specialized agents for translation, testing, and refinement, each responsible for a specific aspect such as generating code, validating correctness, or repairing errors. We evaluate BabelCoder on four benchmark datasets and compare it against four state-of-the-art baselines. BabelCoder outperforms existing methods by 0.5%-13.5% in 94% of cases, achieving an average accuracy of 94.16%.
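The division of labor described above (translate, test, repair) can be illustrated with a minimal Python sketch. This is not BabelCoder's implementation: the names `translate_with_repair`, `TranslationResult`, and `max_rounds`, as well as the toy agents, are hypothetical and chosen only for illustration. In a real system each agent would presumably wrap an LLM call or a test runner; here they are plain functions so the control flow can be run as is.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical agent interfaces (assumptions, not the paper's API).
Translator = Callable[[str], str]        # source code -> candidate translation
Tester = Callable[[str], Optional[str]]  # candidate -> error message, or None if tests pass
Repairer = Callable[[str, str], str]     # (candidate, error message) -> revised candidate


@dataclass
class TranslationResult:
    code: str     # final candidate translation
    passed: bool  # whether the testing agent accepted it
    rounds: int   # number of repair rounds that were used


def translate_with_repair(
    source: str,
    translate: Translator,
    test: Tester,
    repair: Repairer,
    max_rounds: int = 3,  # assumed repair budget; the abstract does not state one
) -> TranslationResult:
    """Translate once, then alternate testing and repair until tests pass."""
    candidate = translate(source)
    for rounds in range(max_rounds + 1):
        error = test(candidate)
        if error is None:
            return TranslationResult(candidate, True, rounds)
        if rounds == max_rounds:
            break  # repair budget exhausted
        candidate = repair(candidate, error)
    return TranslationResult(candidate, False, max_rounds)


# Toy agents standing in for LLM-backed ones (purely illustrative).
def toy_translate(source: str) -> str:
    # Pretend Python-to-Java translation that forgets a semicolon.
    return "int add(int a, int b) { return a + b }"


def toy_test(candidate: str) -> Optional[str]:
    # Stand-in for compiling and running tests on the candidate.
    return None if "a + b;" in candidate else "missing ';' after return expression"


def toy_repair(candidate: str, error: str) -> str:
    # Stand-in for a repair agent that reads the error and patches the code.
    return candidate.replace("a + b }", "a + b; }")


result = translate_with_repair(
    "def add(a, b): return a + b", toy_translate, toy_test, toy_repair
)
print(result)
```

Passing the agents in as callables keeps the loop independent of any particular model or test harness, which mirrors the modular separation the abstract describes without claiming to reproduce it.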