LibRec: Benchmarking Retrieval-Augmented LLMs for Library Migration Recommendations

πŸ“… 2025-08-13
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
To address the low accuracy of substitute-library recommendation during library migration, this paper proposes LibRec, the first retrieval-augmented generation (RAG) framework tailored to this task. Methodologically, LibRec integrates large language models (LLMs) with commit-message parsing and employs in-context learning to identify fine-grained migration intents, and it constructs LibEval, a large-scale, multi-dimensional benchmark annotated with 2,888 real-world migration records. Key contributions include: (1) an intent-driven RAG architecture that improves both recommendation accuracy and interpretability; (2) the release of LibEval, the first dedicated evaluation benchmark for library migration; and (3) a systematic evaluation of ten mainstream LLMs, with ablation studies confirming significant performance gains from each component and robustness across diverse migration-intent categories.

πŸ“ Abstract
In this paper, we propose LibRec, a novel framework that integrates the capabilities of LLMs with retrieval-augmented generation (RAG) techniques to automate the recommendation of alternative libraries. The framework further employs in-context learning to extract migration intents from commit messages to enhance the accuracy of its recommendations. To evaluate the effectiveness of LibRec, we introduce LibEval, a benchmark designed to assess performance on the library migration recommendation task. LibEval comprises 2,888 migration records associated with 2,368 libraries extracted from 2,324 Python repositories. Each migration record captures a source-target library pair, along with its migration intent and intent type. Based on LibEval, we evaluated the effectiveness of ten popular LLMs within our framework, conducted an ablation study to examine the contributions of its key components, explored the impact of various prompt strategies on the framework's performance, assessed its effectiveness across intent types, and performed detailed failure-case analyses.
Problem

Research questions and friction points this paper is trying to address.

Automating library migration recommendations using LLMs and RAG
Enhancing recommendation accuracy via commit message intent extraction
Evaluating framework performance with a benchmark of 2,888 migration records
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates LLMs with RAG for library recommendations
Uses in-context learning to extract migration intents
Introduces LibEval benchmark with 2,888 migration records
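The intent-driven retrieve-then-rank loop summarized above can be sketched roughly as follows. Everything here is illustrative, not the paper's implementation: the toy migration corpus, the keyword-overlap scorer, and the heuristic `extract_intent` function (a stand-in for the LLM + in-context-learning step) are all hypothetical.

```python
# Hypothetical sketch of an intent-driven retrieve-then-rank pipeline,
# loosely modeled on the LibRec description. The paper's actual prompts,
# models, retrieval corpus, and scoring differ.

def extract_intent(commit_message: str) -> str:
    """Stand-in for the LLM intent-extraction step: strip a conventional
    'migrate ... because <reason>' suffix from the commit message."""
    marker = "because"
    if marker in commit_message:
        return commit_message.split(marker, 1)[1].strip()
    return commit_message.strip()

def retrieve_candidates(source_lib: str, intent: str, corpus: list) -> list:
    """Rank target libraries from past migration records: score each record
    by word overlap with the query intent, with a bonus for records that
    migrated away from the same source library."""
    intent_words = set(intent.lower().split())
    scored = []
    for record in corpus:
        overlap = len(intent_words & set(record["intent"].lower().split()))
        bonus = 2 if record["source"] == source_lib else 0
        scored.append((overlap + bonus, record["target"]))
    scored.sort(key=lambda pair: -pair[0])
    return [target for _, target in scored]

# Toy stand-in for a corpus of mined migration records.
corpus = [
    {"source": "urllib2", "target": "requests", "intent": "simpler http api"},
    {"source": "optparse", "target": "argparse", "intent": "deprecated parser"},
    {"source": "simplejson", "target": "json", "intent": "stdlib replacement"},
]

intent = extract_intent("migrate off urllib2 because we want a simpler http api")
ranked = retrieve_candidates("urllib2", intent, corpus)
print(ranked[0])  # prints "requests"
```

In the paper's framework, the retrieval and ranking are backed by an LLM over real mined migration records; this sketch only shows how an extracted intent can condition which substitute libraries surface first.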
πŸ‘₯ Authors
Junxiao Han
Hangzhou City University
Yarong Wang
Polytechnic Institute, Zhejiang University, Hangzhou, China
Xiaodong Gu
Associate Professor, Shanghai Jiao Tong University (Software Engineering, Large Language Models)
Cuiyun Gao
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Yao Wan
Huazhong University of Science and Technology (NLP, Programming Languages, Software Engineering, Large Language Models)
Song Han
School of Computer and Computing Science, Hangzhou City University, Hangzhou, China
David Lo
School of Computing and Information Systems, Singapore Management University, Singapore
Shuiguang Deng
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China