Retrieval with Learned Similarities

📅 2024-07-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Efficient top-k retrieval remains challenging in neural retrieval because complex, non-dot-product similarity functions are computationally intractable to search over at scale. Method: This paper proposes Mixture-of-Logits (MoL), a generic differentiable similarity approximator that relaxes the dot-product constraint; introduces a novel mutual information maximization-based load-balancing loss to mitigate subspace overload; and designs an approximate Maximum Inner Product Search (MIPS) algorithm with tight theoretical error bounds, enabling joint optimization of multi-embedding queries and ID decoding. Contributions/Results: MoL achieves state-of-the-art performance across heterogeneous tasks, including sequential retrieval in recommendation systems and question-answering fine-tuning, while reducing top-k retrieval latency by up to 66× and maintaining recall above 0.99. The framework significantly improves both the expressivity and the real-time capability of retrieval systems in the large-model era.
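At its core, MoL scores a query-item pair with a gated mixture of component dot products rather than a single dot product. The sketch below is an illustrative reconstruction of that shape, not the paper's implementation: in the paper the gating weights come from a learned network over query and item features, while here the raw gating logits `gate_w` are passed in directly for simplicity.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mol_similarity(q_embs, x_embs, gate_w):
    """Mixture-of-Logits: gated sum of P component dot products.

    q_embs: (P, d) query-side embeddings
    x_embs: (P, d) item-side embeddings
    gate_w: (P,)   raw gating logits (a stand-in for the learned
                   gating network described in the paper)
    """
    logits = np.einsum('pd,pd->p', q_embs, x_embs)  # per-component dot products
    pi = softmax(gate_w)                            # gating weights, sum to 1
    return float(pi @ logits)
```

With uniform gates this reduces to the average of the component dot products; a single component with a constant gate recovers the plain dot-product similarity, which is why MoL strictly generalizes MIPS-style retrieval.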

📝 Abstract
Retrieval plays a fundamental role in recommendation systems, search, and natural language processing (NLP) by efficiently finding relevant items from a large corpus given a query. Dot products have been widely used as the similarity function in such tasks, enabled by Maximum Inner Product Search (MIPS) algorithms for efficient retrieval. However, state-of-the-art retrieval algorithms have migrated to learned similarities. These advanced approaches encompass multiple query embeddings, complex neural networks, direct item ID decoding via beam search, and hybrid solutions. Unfortunately, we lack efficient solutions for retrieval in these state-of-the-art setups. Our work addresses this gap by investigating efficient retrieval techniques with expressive learned similarity functions. We establish Mixture-of-Logits (MoL) as a universal approximator of similarity functions, demonstrate that MoL's expressiveness can be realized empirically to achieve superior performance on diverse retrieval scenarios, and propose techniques to retrieve the approximate top-k results using MoL with tight error bounds. Through extensive experimentation, we show that MoL, enhanced by our proposed mutual information-based load balancing loss, sets new state-of-the-art results across heterogeneous scenarios, including sequential retrieval models in recommendation systems and finetuning language models for question answering; and our approximate top-k algorithms outperform baselines by up to 66x in latency while achieving a >0.99 recall rate compared to exact algorithms.
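Because each mixture component is itself a dot product, approximate top-k under a mixture-of-logits score can be sketched as per-component candidate generation followed by exact reranking with the full mixture. The code below is a brute-force illustration of that two-stage shape under stated assumptions, not the paper's exact algorithm: the per-component sort stands in for a real MIPS index, and `gate_fn` is a hypothetical stand-in for the learned gating network.

```python
import numpy as np

def approx_topk_mol(q_embs, item_embs, gate_fn, k, n_cand=50):
    """Two-stage approximate top-k for a mixture-of-logits score.

    1) For each component p, take the top-n_cand items by that
       component's dot product (a stand-in for a MIPS index).
    2) Union the candidates and rerank with the full mixture score.

    q_embs:    (P, d)    query-side embeddings
    item_embs: (N, P, d) item-side embeddings
    gate_fn:   maps per-item component logits (P,) -> gate weights (P,)
    """
    comp = np.einsum('pd,npd->np', q_embs, item_embs)  # (N, P) component logits
    cands = set()
    for p in range(comp.shape[1]):
        cands.update(np.argsort(-comp[:, p])[:n_cand].tolist())
    cands = np.array(sorted(cands))
    scores = np.array([gate_fn(comp[i]) @ comp[i] for i in cands])
    order = np.argsort(-scores)[:k]
    return cands[order], scores[order]
```

Setting `n_cand` to the corpus size makes the result exact, so `n_cand` directly trades recall for latency; the paper's contribution is doing this candidate generation with tight error bounds rather than a fixed heuristic cutoff.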
Problem

Research questions and friction points this paper is trying to address.

Similarity Learning
Information Retrieval
Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Logits (MoL)
Efficient Retrieval Technique
Fast Relevance Search Algorithm
🔎 Similar Papers

Bailu Ding
Microsoft Research
Database Systems

Jiaqi Zhai
Meta, Bellevue, Washington, USA