🤖 AI Summary
To address the inefficiency and weak contextual modeling of LLM-driven sparse retrieval in conversational search, this paper proposes a relaxed knowledge distillation framework. Departing from conventional representation-level distillation, it introduces a similarity-level relaxed distillation objective that directly transfers soft conversation–document similarity scores produced by multiple large language model teachers (e.g., GPT-4, Claude). The framework jointly optimizes contrastive and distillation losses while enforcing explicit sparsity constraints, enabling controllable sparsification. Built on the Learned Sparse Retrieval (LSR) architecture, the method is evaluated on five standard conversational search benchmarks. Results show up to a six-point gain in out-of-domain recall over state-of-the-art baselines, and the multi-teacher variant surpasses its individual teachers in in-domain settings. The method also supports fine-grained sparsity control, balancing retrieval accuracy against computational efficiency.
📝 Abstract
Conversational Search (CS) involves retrieving relevant documents from a corpus while accounting for the conversational context, integrating retrieval with context modeling. Recent advancements in Large Language Models (LLMs) have significantly enhanced CS by enabling query rewriting based on conversational context. However, employing LLMs at inference time poses efficiency challenges. Existing solutions mitigate this issue by distilling embeddings derived from human-rewritten queries, focusing primarily on learning the context modeling task. These methods, however, often separate the contrastive retrieval task from the distillation process, treating it as an independent loss term. To overcome these limitations, we introduce DiSCo (Distillation of Sparse Conversational retrieval), a novel approach that unifies retrieval and context modeling through a relaxed distillation objective. Instead of relying exclusively on representation learning, our method distills similarity scores between conversations and documents, providing more freedom in the representation space and better leveraging the contrastive nature of document relevance. Extensive experiments on Learned Sparse Retrieval (LSR) across five CS datasets demonstrate that DiSCo achieves substantial improvements in both in-domain and out-of-domain retrieval, with up to a six-point gain in recall on out-of-domain datasets over state-of-the-art methods. Additionally, DiSCo employs a multi-teacher distillation strategy, using multiple LLMs as teachers, further enhancing performance and surpassing the individual teachers in in-domain settings. Furthermore, analysis of model sparsity reveals that DiSCo allows for more effective control over the sparsity of the trained models.
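The core idea, distilling conversation–document similarity scores rather than teacher embeddings, can be sketched as a combined training objective. The snippet below is an illustrative PyTorch sketch, not the paper's exact formulation: the function name `disco_style_loss`, the temperature, and the loss weighting are assumptions made here for clarity, and the sparsity regularizer described in the paper is omitted.

```python
import torch
import torch.nn.functional as F

def disco_style_loss(student_sims, teacher_sims_list, labels,
                     distill_weight=1.0, temperature=1.0):
    """Illustrative combined objective (hypothetical sketch):
    a contrastive loss on the student's conversation-document
    similarity scores, plus a similarity-level distillation term
    averaged over multiple LLM teachers.

    student_sims: (batch, n_docs) student similarity scores
    teacher_sims_list: list of (batch, n_docs) teacher scores
    labels: (batch,) index of the relevant document per conversation
    """
    # Contrastive (InfoNCE-style) loss: similarities act as logits
    # over candidate documents for each conversation.
    contrastive = F.cross_entropy(student_sims, labels)

    # Relaxed distillation: match the student's similarity
    # distribution to each teacher's, instead of forcing the
    # student's representations to match teacher embeddings.
    log_p_student = F.log_softmax(student_sims / temperature, dim=-1)
    distill = 0.0
    for teacher_sims in teacher_sims_list:
        p_teacher = F.softmax(teacher_sims / temperature, dim=-1)
        distill = distill + F.kl_div(log_p_student, p_teacher,
                                     reduction="batchmean")
    distill = distill / len(teacher_sims_list)

    return contrastive + distill_weight * distill
```

Because only similarity distributions are matched, the student keeps freedom in how its sparse representations realize those similarities, which is the "relaxation" the abstract refers to.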