🤖 AI Summary
Legacy applications suffer from poor compatibility, elevated security risks, and high maintenance costs due to technological obsolescence. To address these challenges, this paper proposes VAPU (Verifying Agent Pipeline Updater), a multi-agent system for code modernization in which collaborating LLM agents emulate the roles of a software development team and update code files in phases. The system was evaluated with five large language models (LLMs) at temperature settings of 0 and 1, across 20 open-source Python projects. Compared to zero-shot and one-shot prompting baselines, VAPU achieved up to a 22.5% increase in fulfilled update requirements; at low temperature its error counts were comparable to the baselines while requirement fulfillment was higher, depending on the LLM. This work applies a verifying, role-specialized LLM agent pipeline to automated legacy code modernization and empirically supports the practicality of the multi-agent paradigm in this domain.
📝 Abstract
In this study, we present a solution for the modernization of legacy applications, an area of code generation where LLM-based multi-agent systems are proving essential for complex multi-phased tasks. Legacy applications often contain deprecated components that create compatibility, security, and reliability risks, but high resource costs make companies hesitant to update them. We take a step forward by integrating an LLM-based multi-agent system into a legacy web application update, providing a cost-effective solution for updating legacy applications autonomously. We propose a multi-agent system named Verifying Agent Pipeline Updater (VAPU), which is designed to update code files in phases while simulating different roles in a software development team. In our previous study, we evaluated the system for legacy version updates on six legacy web application view files, measuring resulting errors and accomplished requirements. This study extends that evaluation of the multi-agent pipeline from a single LLM to five LLMs, using the temperature parameter at both 0 and 1 settings. Additionally, we tested the system on 20 open-source Python GitHub projects. The evaluation results were compared to Zero-Shot Learning (ZSL) and One-Shot Learning (OSL) prompts. The extended evaluation showed that, particularly at low temperature, VAPU can reach an error count similar to that of the ZSL/OSL prompts but with a higher level of fulfilled requirements, depending on the LLM. VAPU showed up to a 22.5% increase in fulfilled Python file update requirements compared to ZSL/OSL prompts. The study indicates that an LLM-based multi-agent system is a capable solution for autonomously updating components of a legacy application.
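To make the phased, role-based design described above concrete, the following is a minimal sketch of such a pipeline. All names here (`Agent`, `run_pipeline`, the specific roles, and the toy verification check) are illustrative assumptions for exposition, not VAPU's actual implementation; in the real system each agent's `act` step would be an LLM call with a role-specific prompt and a chosen temperature.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    role: str                     # e.g. "developer", "reviewer" (hypothetical roles)
    act: Callable[[str], str]     # stands in for a role-prompted LLM call

def run_pipeline(code: str, agents: List[Agent],
                 verify: Callable[[str], bool], max_rounds: int = 3) -> str:
    """Pass the code through each role in order; retry if verification fails."""
    for _ in range(max_rounds):
        for agent in agents:
            code = agent.act(code)
        if verify(code):          # verifying step gates acceptance of the update
            break
    return code

# Toy stand-ins for LLM agents: modernize a Python 2 print statement.
developer = Agent("developer", lambda c: c.replace("print 'hi'", "print('hi')"))
reviewer = Agent("reviewer", lambda c: c.strip())

result = run_pipeline("print 'hi'  ", [developer, reviewer],
                      verify=lambda c: c == "print('hi')")
```

The key design point the sketch illustrates is that verification is built into the pipeline loop itself, so an update that fails its checks is re-attempted rather than emitted.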