Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models

📅 2024-11-12
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the pre-retrieval information gap in Retrieval-Augmented Generation (RAG), where coarse-grained user queries fail to match the fine-grained knowledge demands of large language models (LLMs), this paper proposes ERRR, a four-stage "Extract-Refine-Retrieve-Read" framework. ERRR first extracts task-specific parametric knowledge from the LLM, then uses it to guide a lightweight, trainable query optimizer that refines the query before retrieval, so that only the information the model actually lacks is fetched. The pipeline works with multiple retrieval systems, and the small optimizer is fine-tuned efficiently via knowledge distillation from a larger teacher model. Evaluated on several open-domain QA benchmarks and heterogeneous retrieval systems, ERRR reports consistent improvements (+5.2-9.8% absolute answer accuracy and +12.4% Recall@5) alongside strong generalization, low computational overhead, and rapid deployment.

📝 Abstract
We introduce the Extract-Refine-Retrieve-Read (ERRR) framework, a novel approach designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems through query optimization tailored to meet the specific knowledge requirements of Large Language Models (LLMs). Unlike conventional query optimization techniques used in RAG, the ERRR framework begins by extracting parametric knowledge from LLMs, followed by using a specialized query optimizer for refining these queries. This process ensures the retrieval of only the most pertinent information essential for generating accurate responses. Moreover, to enhance flexibility and reduce computational costs, we propose a trainable scheme for our pipeline that utilizes a smaller, tunable model as the query optimizer, which is refined through knowledge distillation from a larger teacher model. Our evaluations on various question-answering (QA) datasets and with different retrieval systems show that ERRR consistently outperforms existing baselines, proving to be a versatile and cost-effective module for improving the utility and accuracy of RAG systems.
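The four-stage pipeline described in the abstract can be sketched as plain function composition. This is a minimal illustrative skeleton, not the paper's implementation: the `llm`, `optimizer`, and `retriever` callables are hypothetical placeholders standing in for the actual LLM, the trained query optimizer, and the retrieval backend.

```python
# Illustrative sketch of the ERRR (Extract-Refine-Retrieve-Read) loop.
# Every component here is a placeholder callable; in the paper each step
# is backed by an LLM, a tuned small model, or a retrieval system.

def extract_parametric_knowledge(llm, question):
    """Step 1 (Extract): elicit the LLM's own, possibly incomplete,
    answer draft from its parametric memory."""
    return llm(f"Answer from memory: {question}")

def refine_query(optimizer, question, draft):
    """Step 2 (Refine): a smaller, trainable optimizer rewrites the
    query to target the knowledge the draft is missing or unsure of."""
    return optimizer(question, draft)

def retrieve(retriever, query, k=5):
    """Step 3 (Retrieve): fetch the top-k passages for the refined query."""
    return retriever(query)[:k]

def read(llm, question, passages):
    """Step 4 (Read): generate the final answer grounded in evidence."""
    context = "\n".join(passages)
    return llm(f"Context:\n{context}\nQuestion: {question}")

def errr(llm, optimizer, retriever, question):
    draft = extract_parametric_knowledge(llm, question)
    query = refine_query(optimizer, question, draft)
    passages = retrieve(retriever, query)
    return read(llm, question, passages)
```

The key difference from a vanilla retrieve-then-read pipeline is that the retrieval query is conditioned on the model's own draft, not on the raw user question.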
Problem

Research questions and friction points this paper is trying to address.

Bridge pre-retrieval information gap in RAG systems
Optimize queries for LLMs' specific knowledge requirements
Retrieve only most pertinent information for accurate responses
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts parametric knowledge from LLMs
Uses specialized query optimizer for refinement
Employs knowledge distillation for cost efficiency
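The distillation scheme in the last bullet can be sketched as an offline data-collection step: a large teacher model produces refined queries, and the resulting pairs supervise the small student optimizer. The function and data format below are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch of distilling the query optimizer. A large
# "teacher" LLM rewrites raw questions offline; the (question, rewrite)
# pairs then serve as supervised targets for fine-tuning the smaller
# student model with a standard sequence-to-sequence objective.

def build_distillation_set(teacher, questions):
    """Collect (input, target) pairs: the student learns to imitate
    the teacher's query rewrites."""
    return [(q, teacher(q)) for q in questions]
```

Because the teacher is only needed once to label the training set, inference-time cost is bounded by the small student, which is what makes the module cheap to deploy.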