🤖 AI Summary
Upgrading Python third-party libraries frequently triggers version compatibility issues (VCIs) and code compatibility issues (CCIs). Existing tools mainly detect dependency conflicts and cannot reason jointly about code-level incompatibilities such as API changes, missing modules, or altered signatures. This paper proposes PCREQ, the first end-to-end automated approach that jointly models VCIs and CCIs. It integrates six modules (knowledge acquisition, version compatibility assessment, invoked API and module extraction, code compatibility assessment, version change, and missing-library completion) to generate a compatible `requirements.txt` file and an interpretable repair report. Its key innovation lies in unifying VCI and CCI detection and fully automating the repair of dependency upgrades. Evaluated on REQBench, a large-scale, manually curated benchmark of 2,095 upgrade test cases, PCREQ achieves a 94.03% inference success rate with an average runtime of 60.79 seconds per case, outperforming PyEGo (37.02%), ReadPyE (37.16%), and LLM-based approaches (GPT-4o, DeepSeek V3/R1).
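To make the idea of identifying invoked APIs and modules concrete, the following is a minimal sketch using Python's standard `ast` module. It is not PCREQ's implementation: the function name `extract_invoked_apis` and the alias-resolution strategy are assumptions chosen for illustration, and the sketch only resolves calls made through names bound by `import` statements.

```python
import ast


def extract_invoked_apis(source: str) -> tuple[set[str], set[str]]:
    """Return (top-level imported modules, fully qualified names of called APIs).

    Hypothetical helper for illustration; it ignores relative imports,
    dynamic imports, and calls on objects whose origin is not an import.
    """
    tree = ast.parse(source)
    aliases: dict[str, str] = {}  # local name -> fully qualified name
    modules: set[str] = set()

    # Pass 1: record which local names are bound by import statements.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                top = alias.name.split(".")[0]
                modules.add(top)
                if alias.asname:
                    aliases[alias.asname] = alias.name  # import a.b as c
                else:
                    aliases[top] = top  # "import a.b" binds the name "a"
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
            for alias in node.names:
                local = alias.asname or alias.name
                aliases[local] = f"{node.module}.{alias.name}"

    # Pass 2: resolve each call whose root name came from an import.
    calls: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            parts = []
            func = node.func
            while isinstance(func, ast.Attribute):
                parts.append(func.attr)
                func = func.value
            if isinstance(func, ast.Name) and func.id in aliases:
                parts.append(aliases[func.id])
                calls.add(".".join(reversed(parts)))
    return modules, calls


example = (
    "import numpy as np\n"
    "from torch.nn import Linear\n"
    "x = np.linalg.norm([1])\n"
    "layer = Linear(2, 3)\n"
    "print(x)\n"
)
found_modules, found_calls = extract_invoked_apis(example)
print(sorted(found_modules))  # ['numpy', 'torch']
print(sorted(found_calls))    # ['numpy.linalg.norm', 'torch.nn.Linear']
```

The built-in `print` call is skipped because it is not bound by an import. A fuller analysis would also need to track assignments and method calls on returned objects.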
📝 Abstract
Python third-party libraries (TPLs) are essential in modern software development, but upgrades often cause compatibility issues that lead to system failures. These issues fall into two categories: version compatibility issues (VCIs) and code compatibility issues (CCIs). Existing tools mainly detect dependency conflicts and overlook code-level incompatibilities, and no existing solution fully automates the inference of compatible versions for both VCIs and CCIs. To fill this gap, we propose PCREQ, the first approach to automatically infer compatible requirements by combining version and code compatibility analysis. PCREQ integrates six modules: knowledge acquisition, version compatibility assessment, invoked API and module extraction, code compatibility assessment, version change, and missing TPL completion. PCREQ collects candidate versions, checks for conflicts, identifies API usage, evaluates code compatibility, and iteratively adjusts versions to generate a compatible requirements.txt with a detailed repair report. To evaluate PCREQ, we construct REQBench, a large-scale benchmark with 2,095 upgrade test cases (including 406 that pip cannot solve). Results show that PCREQ achieves a 94.03% inference success rate, outperforming PyEGo (37.02%) and ReadPyE (37.16%), and exceeding LLM-based approaches (GPT-4o, DeepSeek V3/R1) by 18-20%. PCREQ processes each REQBench case in 60.79 seconds on average, demonstrating practical efficiency. PCREQ significantly reduces the manual effort of troubleshooting upgrades, advancing the automation of Python dependency maintenance.
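The interplay between the version check and the code check can be illustrated with a toy search. This is a simplified sketch, not PCREQ's algorithm: it enumerates candidate versions exhaustively (newest first) instead of adjusting them iteratively, and the libraries `libA` and `libB`, the tables `PROVIDES` and `REQUIRES`, and every function name are hypothetical stand-ins for a real knowledge base.

```python
from itertools import product


def parse(version: str) -> tuple[int, ...]:
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical knowledge base: APIs exposed by each (library, version).
PROVIDES = {
    ("libA", "2.0"): {"libA.load", "libA.run"},
    ("libA", "1.0"): {"libA.load", "libA.run", "libA.legacy"},
    ("libB", "1.5"): {"libB.fit"},
    ("libB", "1.4"): {"libB.fit"},
}
# Hypothetical constraints: dependency -> (min inclusive, max exclusive).
REQUIRES = {
    ("libB", "1.5"): {"libA": ("2.0", "3.0")},
    ("libB", "1.4"): {"libA": ("1.0", "2.0")},
}


def candidates(lib: str) -> list[str]:
    """Known versions of a library, newest first."""
    return sorted((v for name, v in PROVIDES if name == lib), key=parse, reverse=True)


def version_ok(assignment: dict[str, str]) -> bool:
    """Version check: every declared dependency range is satisfied."""
    for lib, ver in assignment.items():
        for dep, (low, high) in REQUIRES.get((lib, ver), {}).items():
            if dep in assignment and not parse(low) <= parse(assignment[dep]) < parse(high):
                return False
    return True


def code_ok(assignment: dict[str, str], invoked: set[str]) -> bool:
    """Code check: every invoked API exists in the chosen version."""
    for api in invoked:
        lib = api.split(".")[0]
        if api not in PROVIDES.get((lib, assignment[lib]), set()):
            return False
    return True


def infer(pinned: dict[str, str], invoked: set[str]):
    """Return the first assignment passing both checks, or None."""
    libs = sorted({api.split(".")[0] for api in invoked} | set(pinned))
    options = [[pinned[lib]] if lib in pinned else candidates(lib) for lib in libs]
    for combo in product(*options):
        assignment = dict(zip(libs, combo))
        if version_ok(assignment) and code_ok(assignment, invoked):
            return assignment
    return None


def to_requirements(assignment: dict[str, str]) -> str:
    """Render an assignment as requirements.txt content."""
    return "\n".join(f"{lib}=={ver}" for lib, ver in sorted(assignment.items()))


result = infer(pinned={}, invoked={"libA.legacy", "libB.fit"})
print(to_requirements(result))
```

In this toy data the project calls `libA.legacy`, which only exists in `libA` 1.0, so the search falls back to `libA==1.0` and `libB==1.4`. Pinning `libB` to 1.5 makes the problem unsolvable, because that version requires a `libA` release that no longer provides the invoked API. A conflict-free version set can still break the code, which is why the two checks have to be combined.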