🤖 AI Summary
Retro-li addresses the challenges that arise in RAG systems with small-scale non-parametric memory banks, where data sparsity leads to inaccurate retrieval, noise sensitivity, and poor cross-domain generalization. It proposes a lightweight, robust retrieval-augmented generation framework. Its core contributions are: (1) the first introduction of non-parametric memory regularization, substantially improving the robustness of semantic retrieval under noisy conditions and generalization under domain shift; and (2) an in-memory-computing-friendly architecture enabling O(1) constant-time retrieval. Experiments demonstrate that Retro-li maintains high retrieval accuracy on small memory banks, achieves significantly lower perplexity than baselines, delivers marked performance gains on cross-domain tasks, and incurs less than 1% accuracy degradation from retrieval noise in hardware simulations.
📝 Abstract
Retrieval-augmented generation (RAG) systems such as Retro have been shown to improve language modeling capabilities and reduce toxicity and hallucinations by retrieving from a database of non-parametric memory containing trillions of entries. We introduce Retro-li, which shows that retrieval can also help with a small-scale database, but it demands more accurate and better neighbors when searching a smaller, hence sparser, non-parametric memory. This can be met by using a proper semantic similarity search. We further propose, for the first time, adding regularization to the non-parametric memory: it significantly reduces perplexity when the neighbor search operations are noisy during inference, and it improves generalization when a domain shift occurs. We also show that Retro-li's non-parametric memory can potentially be implemented on analog in-memory computing hardware, exhibiting O(1) search time while introducing noise into neighbor retrieval, with minimal (<1%) performance loss. Our code is available at: https://github.com/IBM/Retrieval-Enhanced-Transformer-Little.
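To make the abstract's claim concrete, here is a minimal sketch of cosine-similarity neighbor search over a small non-parametric memory, with optional Gaussian perturbation of the memory modeling the noisy analog in-memory search described above. This is an illustrative toy using random embeddings and an assumed isotropic noise model, not the paper's actual implementation; all names (`memory`, `retrieve`, `noise_std`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy non-parametric memory: N chunk embeddings of dimension d,
# L2-normalized so a dot product equals cosine similarity.
N, d = 1000, 64
memory = rng.normal(size=(N, d)).astype(np.float32)
memory /= np.linalg.norm(memory, axis=1, keepdims=True)

def retrieve(query: np.ndarray, k: int = 4, noise_std: float = 0.0) -> np.ndarray:
    """Top-k cosine-similarity search over the memory.

    noise_std > 0 perturbs the stored embeddings with Gaussian noise,
    a crude stand-in (our assumption) for analog in-memory hardware noise.
    """
    q = query / np.linalg.norm(query)
    noisy_memory = memory + noise_std * rng.normal(size=memory.shape).astype(np.float32)
    scores = noisy_memory @ q          # cosine similarity to every entry
    return np.argsort(-scores)[:k]     # indices of the k best neighbors

# A query close to memory entry 42 should still recover it under mild noise.
query = memory[42] + 0.05 * rng.normal(size=d).astype(np.float32)
clean_neighbors = retrieve(query, noise_std=0.0)
noisy_neighbors = retrieve(query, noise_std=0.05)
```

In this toy setting the true neighbor survives mild perturbation, loosely mirroring the abstract's observation that hardware noise in neighbor retrieval costs little accuracy; regularizing the memory during training is what makes the real model tolerant of such noise.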