🤖 AI Summary
In software development, manual or scripted environment configuration is inefficient and unreliable—especially when onboarding unfamiliar Python codebases. This paper introduces Repo2Run, the first end-to-end LLM agent that fully automates environment setup: from source-code analysis and dependency inference to generating executable Dockerfiles. Its key contributions are: (1) an atomic configuration synthesis mechanism, leveraging dual-environment isolation and rollback support to guarantee execution atomicity; and (2) a structured Dockerfile generator guided by LLM-based planning and iterative sandbox execution feedback, enabling precise translation of successful configuration steps into robust, reproducible image definitions. Evaluated on a benchmark of 420 real-world Python repositories, Repo2Run achieves an 86.0% configuration success rate—outperforming the best baseline by 63.9 percentage points.
📝 Abstract
Environment configuration is a critical yet time-consuming step in software development, especially when dealing with unfamiliar code repositories. While Large Language Models (LLMs) have shown potential for software engineering tasks, existing methods for environment configuration often rely on manual effort or fragile scripts, leading to inefficiency and unreliable outcomes. We introduce Repo2Run, the first LLM-based agent designed to fully automate environment configuration and generate executable Dockerfiles for arbitrary Python repositories. We address two major challenges: (1) enabling the LLM agent to configure environments within isolated Docker containers, and (2) ensuring that the successful configuration process is recorded and accurately transferred to a Dockerfile without error. To achieve this, we propose atomic configuration synthesis, which has two parts: a dual-environment architecture (internal and external environments) with a rollback mechanism that prevents environment "pollution" from failed commands and guarantees atomic execution (a command executes fully or not at all), and a Dockerfile generator that transfers successful configuration steps into runnable Dockerfiles. We evaluate Repo2Run on our proposed benchmark of 420 recent Python repositories with unit tests, where it achieves an 86.0% success rate, outperforming the best baseline by 63.9%.
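The rollback-and-record idea described above can be illustrated with a small sketch. This is not Repo2Run's implementation: the container is replaced by an in-memory set of package names, and `AtomicConfigurator`, `install`, and the `python:3.10` base image are hypothetical names chosen for illustration. The sketch only shows the principle that a failed command leaves no trace in the environment and that only successful commands reach the generated Dockerfile.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Hypothetical stand-in for a container's state: a set of installed package names.
State = Set[str]
# A command mutates the state in place and returns True on success.
Command = Callable[[State], bool]


@dataclass
class AtomicConfigurator:
    base_image: str = "python:3.10"  # assumed base image, not taken from the paper
    state: State = field(default_factory=set)
    history: List[str] = field(default_factory=list)

    def run(self, text: str, command: Command) -> bool:
        """Apply a command atomically: commit on success, roll back on failure."""
        snapshot = set(self.state)  # copy of the state before the command runs
        try:
            ok = command(self.state)
        except Exception:
            ok = False
        if not ok:
            self.state = snapshot  # discard any partial changes ("pollution")
            return False
        self.history.append(text)  # record only commands that succeeded
        return True

    def dockerfile(self) -> str:
        """Turn the recorded successful commands into Dockerfile text."""
        lines = [f"FROM {self.base_image}"]
        lines += [f"RUN {cmd}" for cmd in self.history]
        return "\n".join(lines) + "\n"


def install(pkg: str, fail: bool = False) -> Command:
    """Build a simulated install command; a failing one leaves partial debris."""
    def command(state: State) -> bool:
        if fail:
            state.add(pkg + "-partial")  # simulated half-finished install
            return False
        state.add(pkg)
        return True
    return command


if __name__ == "__main__":
    cfg = AtomicConfigurator()
    cfg.run("pip install requests", install("requests"))
    cfg.run("pip install broken", install("broken", fail=True))
    print(cfg.dockerfile())
```

In a real setting the snapshot would be a container-level checkpoint rather than a copied set, but the control flow stays the same: snapshot, execute, then either commit and record or restore.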