Raiders of the Lost Dependency: Fixing Dependency Conflicts in Python using LLMs

📅 2025-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Manual resolution of module version conflicts and complex transitive dependencies in Python programs is inefficient and error-prone. Method: This paper introduces PLLM, the first LLM-driven closed-loop dependency repair framework for Python. PLLM integrates retrieval-augmented generation (RAG) with natural language error parsing to dynamically infer required modules and compatible versions, and employs an iterative testing environment with execution feedback to realize a "diagnose, generate, validate, optimize" closed loop. Unlike traditional approaches relying on static dependency graphs or lookup tables, PLLM eliminates dependence on predefined rules or explicit dependency declarations. Results: Evaluated on the HG2.9K dataset, PLLM achieves 15.97% and 21.58% higher repair success rates than ReadPyE and PyEGo, respectively, with particularly notable improvements in cross-version compatibility for numerical computing and machine learning libraries.

📝 Abstract
Fixing Python dependency issues is a tedious and error-prone task for developers, who must manually identify and resolve environment dependencies and version constraints of third-party modules and Python interpreters. Researchers have attempted to automate this process by relying on large knowledge graphs and database lookup tables. However, these traditional approaches face limitations due to the variety of dependency error types, large sets of possible module versions, and conflicts among transitive dependencies. This study explores the potential of using large language models (LLMs) to automatically fix dependency issues in Python programs. We introduce PLLM (pronounced "plum"), a novel technique that employs retrieval-augmented generation (RAG) to help an LLM infer Python versions and required modules for a given Python file. PLLM builds a testing environment that iteratively (1) prompts the LLM for module combinations, (2) tests the suggested changes, and (3) provides feedback (error messages) to the LLM to refine the fix. This feedback cycle leverages natural language processing (NLP) to intelligently parse and interpret build error messages. We benchmark PLLM on the Gistable HG2.9K dataset, a collection of challenging single-file Python gists. We compare PLLM against two state-of-the-art automatic dependency inference approaches, namely PyEGo and ReadPyE, w.r.t. the ability to resolve dependency issues. Our results indicate that PLLM can fix more dependency issues than the two baselines, with +218 (+15.97%) more fixes over ReadPyE and +281 (+21.58%) over PyEGo. Our deeper analyses suggest that PLLM is particularly beneficial for projects with many dependencies and for specific third-party numerical and machine-learning modules. Our findings demonstrate the potential of LLM-based approaches to iteratively resolve Python dependency issues.
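The iterative loop the abstract describes (prompt the LLM for module combinations, test them, feed build errors back) can be sketched as below. This is a minimal illustration, not PLLM's actual implementation: `suggest_fix` is a hypothetical stand-in for the paper's LLM+RAG query, here replaced by a hard-coded error-to-pin mapping so the sketch stays self-contained.

```python
import subprocess
import sys

def suggest_fix(error_message, history):
    """Hypothetical stand-in for PLLM's LLM+RAG call: given a parsed
    build/run error, return candidate pinned modules to install.
    The real system prompts an LLM; here we use a toy lookup."""
    if "No module named 'requests'" in error_message:
        return ["requests==2.31.0"]
    return []  # no suggestion -> give up

def repair_loop(script_path, max_iters=5, installer=None):
    """Sketch of the diagnose-generate-validate cycle: run the script,
    parse the error, ask for module suggestions, install, retry."""
    # Default installer shells out to pip; tests can inject a stub.
    installer = installer or (lambda pkgs: subprocess.run(
        [sys.executable, "-m", "pip", "install", *pkgs], check=True))
    history = []
    for _ in range(max_iters):
        proc = subprocess.run([sys.executable, script_path],
                              capture_output=True, text=True)
        if proc.returncode == 0:
            return True, history       # environment fixed
        suggestion = suggest_fix(proc.stderr, history)
        if not suggestion:
            break                       # error not recognized
        history.append(suggestion)
        installer(suggestion)           # apply the candidate fix
    return False, history
```

In the paper's framework, the suggestion step also consults retrieved documentation (RAG) and the loop runs inside an isolated build environment; this sketch only shows the control flow of the feedback cycle.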
Problem

Research questions and friction points this paper is trying to address.

Automatic Repair
Python Dependencies
Version Conflicts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Python Dependency Resolution
Automated Code Repair
Antony Bartlett
Delft University of Technology, The Netherlands
Cynthia Liem
Delft University of Technology, The Netherlands
Annibale Panichella
Associate Professor, Delft University of Technology
Software Testing · SE4AI · Test Generation · SBSE · Fuzzing