LegacyTranslate: LLM-based Multi-Agent Method for Legacy Code Translation

📅 2026-03-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the modernization of large legacy systems, where migration must preserve domain logic while conforming to the enterprise architecture and shared APIs. Code generated directly by large language models often fails to compile or to integrate with existing frameworks. The authors propose LegacyTranslate, a multi-agent framework that migrates PL/SQL to Java through a three-stage pipeline: initial translation, API alignment, and iterative refinement. The pipeline combines context-aware example retrieval, matching against an API knowledge base, and refinement driven by compiler feedback. Initial translation alone yields a 45.6% compilability rate and a 30.9% test pass rate; adding API alignment and iterative refinement raises compilability by a further 8% and the test pass rate by a further 3%.

📝 Abstract

Modernizing large legacy systems remains a major challenge in enterprise environments, particularly when migration must preserve domain-specific logic while conforming to internal architectural frameworks and shared APIs. Direct application of Large Language Models (LLMs) for code translation often produces syntactically valid outputs that fail to compile or integrate within existing production frameworks, limiting their practical adoption in real-world modernization efforts. In this paper, we propose LegacyTranslate, a multi-agent framework for API-aware code translation, developed and evaluated in the context of an ongoing modernization effort at a financial institution migrating approximately 2.5 million lines of PL/SQL to Java. The core idea is to use specialized LLM-based agents, each addressing a different aspect of the translation challenge. Specifically, LegacyTranslate consists of three agents: the Initial Translation Agent produces an initial Java translation using retrieved in-context examples; the API Grounding Agent aligns the code with existing APIs by retrieving relevant entries from an API knowledge base; and the Refinement Agent iteratively refines the output using compiler feedback and API suggestions to improve correctness. Our experiments show that each agent contributes to better translation quality. The Initial Translation Agent alone achieves 45.6% compilable outputs and a 30.9% test-pass rate. With the API Grounding Agent and the Refinement Agent, compilation improves by an additional 8% and test-pass accuracy increases by 3%.
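
The three-agent flow described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not publish an interface, so every name here (`translate`, `TranslationResult`, `find_examples`, `find_apis`, `compile_java`, `max_rounds`) and every prompt string is hypothetical. The LLM, the two retrievers, and the compiler are passed in as plain callables so the control flow stays visible.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interfaces; none of these names come from the paper.
LLM = Callable[[str], str]              # prompt -> generated text
Retriever = Callable[[str], List[str]]  # query -> retrieved snippets
Compiler = Callable[[str], List[str]]   # Java source -> compiler error messages


@dataclass
class TranslationResult:
    java_code: str
    compiles: bool
    rounds_used: int


def translate(plsql: str, llm: LLM, find_examples: Retriever,
              find_apis: Retriever, compile_java: Compiler,
              max_rounds: int = 3) -> TranslationResult:
    # Stage 1 (Initial Translation Agent): translate with retrieved
    # in-context examples.
    examples = "\n".join(find_examples(plsql))
    code = llm(f"Examples:\n{examples}\nTranslate this PL/SQL to Java:\n{plsql}")

    # Stage 2 (API Grounding Agent): retrieve API knowledge-base entries and
    # align the draft with them. Doing this as a rewrite is an assumption.
    apis = "\n".join(find_apis(code))
    code = llm(f"Known APIs:\n{apis}\nRewrite to use these APIs:\n{code}")

    # Stage 3 (Refinement Agent): loop on compiler feedback plus the API
    # suggestions until the code compiles or the round budget runs out.
    errors = compile_java(code)
    rounds = 0
    while errors and rounds < max_rounds:
        rounds += 1
        feedback = "\n".join(errors)
        code = llm(f"Known APIs:\n{apis}\nCompiler errors:\n{feedback}\nFix:\n{code}")
        errors = compile_java(code)
    return TranslationResult(code, not errors, rounds)
```

In this sketch the refinement loop stops as soon as the compiler reports no errors; a fuller version would presumably also run the test suite, since the paper reports test-pass rate separately from compilability.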

Problem

Research questions and friction points this paper is trying to address.

legacy code translation

Large Language Models

API integration

code modernization

compilation correctness

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent LLM

legacy code translation

API-aware translation