🤖 AI Summary
This work addresses the challenge of automating library API migration when no real-world migration examples are available. The authors propose an unsupervised approach that uses large language models (LLMs) to generate initial migration examples without labeled data. An agent then generalizes these examples into structured, testable code transformation rules, which are executed in the PolyglotPiranha framework. The study is described as the first to combine LLMs' zero-shot generation capabilities with programmatic code transformation tools. The method synthesizes reusable, generalizable migration scripts across multiple Python library migration tasks, making API migration more practical in fully unsupervised settings.
📝 Abstract
Library migration is a common but error-prone task in software development. Developers may need to replace one library with another for reasons such as changing requirements or licensing changes. Migration typically entails updating and rewriting source code manually. While automated migration tools exist, most rely on mining examples from real-world projects that have already undergone similar migrations. However, such data are scarce, and collecting them for arbitrary pairs of libraries is difficult. Moreover, these migration tools often do not take advantage of modern code transformation infrastructure. In this paper, we present a new approach to automated API migration that sidesteps these limitations. Instead of relying on existing migration data or using LLMs directly for transformation, we use LLMs to extract migration examples. Next, we use an agent to generalize those examples into reusable transformation scripts in PolyglotPiranha, a modern code transformation tool. Our method distills latent migration knowledge from LLMs into structured, testable, and repeatable migration logic, without requiring preexisting corpora or manual engineering effort. Experimental results across Python libraries show that our system can generate diverse migration examples and synthesize transformation scripts that generalize to real-world codebases.
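To make the idea of generalizing one migration example into a reusable, testable rule more concrete, here is a minimal sketch using only Python's standard `ast` module. It is not PolyglotPiranha's rule format or API and not the authors' pipeline. The `requests` to `httpx` pair, the `RenameModule` class, and the `migrate` helper are all hypothetical names chosen for illustration, and real migrations between these libraries may need more than a module rename.

```python
import ast

# Hypothetical before/after pair, of the kind an LLM might be asked to produce.
BEFORE = "import requests\nresp = requests.get(url, timeout=5)\n"
AFTER = "import httpx\nresp = httpx.get(url, timeout=5)\n"


class RenameModule(ast.NodeTransformer):
    """Illustrative rule: rewrite `import old` and `old.<attr>` to use `new`."""

    def __init__(self, old: str, new: str) -> None:
        self.old = old
        self.new = new

    def visit_Import(self, node: ast.Import) -> ast.Import:
        # Only handles plain `import old`; aliased imports are left untouched.
        for alias in node.names:
            if alias.name == self.old and alias.asname is None:
                alias.name = self.new
        return node

    def visit_Attribute(self, node: ast.Attribute) -> ast.Attribute:
        # Visit children first so nested accesses are rewritten too.
        self.generic_visit(node)
        if isinstance(node.value, ast.Name) and node.value.id == self.old:
            node.value.id = self.new
        return node


def migrate(source: str, old: str, new: str) -> str:
    """Apply the rename rule to `source` and return the rewritten code."""
    tree = RenameModule(old, new).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


def normalize(source: str) -> str:
    """Round-trip through the AST so formatting differences do not matter."""
    return ast.unparse(ast.parse(source))


# The example pair doubles as a test case for the rule.
rule_reproduces_example = migrate(BEFORE, "requests", "httpx") == normalize(AFTER)

# The same rule applied to code that was not part of the example.
unseen = migrate("x = requests.post(u, json=d)", "requests", "httpx")
```

The point of the sketch is the shape of the workflow: a single before/after pair becomes both the source of a rule and a regression test for it, and the rule can then be applied to code beyond the original example.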