MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair

📅 2025-09-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods for validating and repairing code translations generalize poorly across languages and rely on existing, often inadequate test suites, which leads to false equivalence claims and ineffective repairs. This paper introduces MatchFixAgent, a language-agnostic, LLM-based multi-agent framework that splits the work among semantic analysis, test generation and execution, repair, and a final verdict, in order to validate and repair repository-level code translations. Rather than depending on pre-existing tests, a test agent writes and runs its own tests based on the semantic analysis, and a repair agent attempts a fix when a test fails. On 2,219 translation pairs drawn from 24 GitHub projects, MatchFixAgent produces an (in)equivalence verdict for 99.2% of pairs and agrees with prior work on 72.8% of them; where the two disagree, MatchFixAgent is correct 60.7% of the time. It also repairs 50.6% of inequivalent translations, compared with 18.5% for prior work.

📝 Abstract

Code translation transforms source code from one programming language (PL) to another. Validating the functional equivalence of a translation and repairing it, if necessary, are critical steps in code translation. Existing automated validation and repair approaches struggle to generalize to many PLs due to high engineering overhead, and they rely on existing and often inadequate test suites, which results in false claims of equivalence and ineffective translation repair. We develop MatchFixAgent, a large language model (LLM)-based, PL-agnostic framework for equivalence validation and repair of translations. MatchFixAgent features a multi-agent architecture that divides equivalence validation into several sub-tasks to ensure thorough and consistent semantic analysis of the translation. It then feeds this analysis to the test agent, which writes and executes tests. Upon observing a test failure, the repair agent attempts to fix the translation bug. The final (in)equivalence decision is made by the verdict agent, considering semantic analyses and test execution results. We compare MatchFixAgent's validation and repair results with four repository-level code translation techniques. We use 2,219 translation pairs from their artifacts, which cover 6 PL pairs and are collected from 24 GitHub projects totaling over 900K lines of code. Our results demonstrate that MatchFixAgent produces (in)equivalence verdicts for 99.2% of translation pairs, with the same equivalence validation result as prior work on 72.8% of them. When MatchFixAgent's result disagrees with prior work, we find that 60.7% of the time MatchFixAgent's result is actually correct. In addition, we show that MatchFixAgent can repair 50.6% of inequivalent translations, compared to prior work's 18.5%. This demonstrates that MatchFixAgent is far more adaptable to many PL pairs than prior work, while producing highly accurate validation results.
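
The control flow described in the abstract (semantic analysis, then testing, then repair on failure, then a verdict) can be pictured with a minimal Python sketch. This is not the authors' implementation: every name here (`validate_and_repair`, `TestResult`, `Outcome`, the `toy_*` stubs) is hypothetical, the agents are plain callables standing in for LLM-backed agents, and the `max_repairs` budget and the choice to re-run analysis after each repair are assumptions that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TestResult:
    passed: bool
    log: str = ""


@dataclass
class Outcome:
    equivalent: bool
    translation: str
    repaired: bool
    attempts: int


def validate_and_repair(
    source: str,
    translation: str,
    analyzers: List[Callable[[str, str], str]],
    test_agent: Callable[[str, str, List[str]], TestResult],
    repair_agent: Callable[[str, str, TestResult], str],
    verdict_agent: Callable[[List[str], TestResult], bool],
    max_repairs: int = 2,  # assumed budget; not taken from the paper
) -> Outcome:
    current = translation
    attempts = 0
    while True:
        # Each analyzer stands in for one semantic-analysis sub-task.
        analyses = [analyze(source, current) for analyze in analyzers]
        # The test agent writes and runs tests informed by the analyses.
        result = test_agent(source, current, analyses)
        if result.passed or attempts >= max_repairs:
            break
        # On a test failure, the repair agent proposes a new translation.
        current = repair_agent(source, current, result)
        attempts += 1
    # The verdict weighs both the analyses and the test results.
    equivalent = verdict_agent(analyses, result)
    return Outcome(equivalent, current, current != translation, attempts)


# Toy stand-ins so the sketch runs without any LLM; real agents would differ.
def toy_analyzer(src: str, tgt: str) -> str:
    return "match" if src == tgt else "mismatch"


def toy_test(src: str, tgt: str, analyses: List[str]) -> TestResult:
    return TestResult(passed=(src == tgt), log="compared strings")


def toy_repair(src: str, tgt: str, result: TestResult) -> str:
    return src  # pretend the repair recovers the intended behavior


def toy_verdict(analyses: List[str], result: TestResult) -> bool:
    return result.passed and all(a == "match" for a in analyses)


outcome = validate_and_repair(
    "x + 1", "x - 1", [toy_analyzer], toy_test, toy_repair, toy_verdict
)
no_budget = validate_and_repair(
    "x + 1", "x - 1", [toy_analyzer], toy_test, toy_repair, toy_verdict,
    max_repairs=0,
)
```

With the toy stubs, the first call repairs the translation in one attempt and is judged equivalent, while the second call, which has no repair budget, leaves the translation unchanged and is judged inequivalent. Keeping the verdict as a separate step from the tests mirrors the abstract's point that the final decision considers both the semantic analyses and the test execution results.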

Problem

Research questions and friction points this paper is trying to address.

Validating functional equivalence in cross-language code translation

Repairing translation bugs without relying on existing test suites

Generalizing validation and repair across multiple programming languages

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based framework for code translation validation

Multi-agent architecture divides validation into sub-tasks

Automated test generation and execution for equivalence verification

🔎 Similar Papers

Automated Test Case Repair Using Language Models