LibRec: Benchmarking Retrieval-Augmented LLMs for Library Migration Recommendations

πŸ“… 2025-08-13
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
To address the low accuracy of substitute-library recommendation during library migration, this paper proposes LibRec, the first retrieval-augmented generation (RAG) framework tailored to this task. Methodologically, LibRec integrates large language models (LLMs) with commit-message parsing and employs in-context learning to identify fine-grained migration intents, and it constructs LibEval, a large-scale, multi-dimensional benchmark annotated with 2,888 real-world migration records. Key contributions include: (1) an intent-driven RAG architecture that improves both recommendation accuracy and interpretability; (2) the release of LibEval, the first dedicated evaluation benchmark for library migration; and (3) a systematic evaluation of ten mainstream LLMs, with ablation studies confirming significant performance gains from each component and robustness across diverse migration-intent categories.

πŸ“ Abstract
In this paper, we propose LibRec, a novel framework that integrates the capabilities of LLMs with retrieval-augmented generation (RAG) techniques to automate the recommendation of alternative libraries. The framework further employs in-context learning to extract migration intents from commit messages to enhance the accuracy of its recommendations. To evaluate the effectiveness of LibRec, we introduce LibEval, a benchmark designed to assess performance on the library migration recommendation task. LibEval comprises 2,888 migration records associated with 2,368 libraries extracted from 2,324 Python repositories. Each migration record captures a source-target library pair, along with its migration intent and intent type. Based on LibEval, we evaluated the effectiveness of ten popular LLMs within our framework, conducted an ablation study to examine the contributions of its key components, explored the impact of various prompt strategies on the framework's performance, assessed its effectiveness across intent types, and performed detailed failure-case analyses.
Problem

Research questions and friction points this paper is trying to address.

Automating library migration recommendations using LLMs and RAG
Enhancing recommendation accuracy via commit message intent extraction
Evaluating framework performance with a benchmark of 2,888 migration records
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates LLMs with RAG for library recommendations
Uses in-context learning to extract migration intents
Introduces LibEval benchmark with 2,888 migration records
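The intent-driven retrieve-then-rank loop summarized above can be sketched roughly as follows. Everything here is illustrative, not the paper's implementation: the toy migration corpus, the keyword-overlap scorer, and the heuristic `extract_intent` function (a stand-in for the LLM + in-context-learning step) are all hypothetical.

```python
# Hypothetical sketch of an intent-driven retrieve-then-rank pipeline,
# loosely modeled on the LibRec description. The paper's actual prompts,
# models, retrieval corpus, and scoring differ.

def extract_intent(commit_message: str) -> str:
    """Stand-in for the LLM intent-extraction step: strip a conventional
    'migrate ... because <reason>' suffix from the commit message."""
    marker = "because"
    if marker in commit_message:
        return commit_message.split(marker, 1)[1].strip()
    return commit_message.strip()

def retrieve_candidates(source_lib: str, intent: str, corpus: list) -> list:
    """Rank target libraries from past migration records: score each record
    by word overlap with the query intent, with a bonus for records that
    migrated away from the same source library."""
    intent_words = set(intent.lower().split())
    scored = []
    for record in corpus:
        overlap = len(intent_words & set(record["intent"].lower().split()))
        bonus = 2 if record["source"] == source_lib else 0
        scored.append((overlap + bonus, record["target"]))
    scored.sort(key=lambda pair: -pair[0])
    return [target for _, target in scored]

# Toy stand-in for a corpus of mined migration records.
corpus = [
    {"source": "urllib2", "target": "requests", "intent": "simpler http api"},
    {"source": "optparse", "target": "argparse", "intent": "deprecated parser"},
    {"source": "simplejson", "target": "json", "intent": "stdlib replacement"},
]

intent = extract_intent("migrate off urllib2 because we want a simpler http api")
ranked = retrieve_candidates("urllib2", intent, corpus)
print(ranked[0])  # prints "requests"
```

In the paper's framework, the retrieval and ranking are backed by an LLM over real mined migration records; this sketch only shows how an extracted intent can condition which substitute libraries surface first.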
πŸ‘₯ Authors
Junxiao Han
Hangzhou City University
Yarong Wang
Polytechnic Institute, Zhejiang University, Hangzhou, China
Xiaodong Gu
Associate Professor, Shanghai Jiao Tong University (Software Engineering, Large Language Models)
Cuiyun Gao
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Yao Wan
Huazhong University of Science and Technology (NLP, Programming Languages, Software Engineering, Large Language Models)
Song Han
School of Computer and Computing Science, Hangzhou City University, Hangzhou, China
David Lo
School of Computing and Information Systems, Singapore Management University, Singapore
Shuiguang Deng
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China