Distributed Retrieval-Augmented Generation

📅 2025-05-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Centralized retrieval-augmented generation (RAG) for large language models (LLMs) on edge devices carries data privacy risks and runs into knowledge base scalability bottlenecks. Both are acute in sensitive domains such as healthcare, where privacy concerns impede the collection of patient data and keeping knowledge current (e.g., pandemic information) is costly. This paper proposes Distributed RAG (DRAG), which removes the central knowledge repository. Its retrieval algorithm, Topic-Aware Random Walk (TARW), uses an LLM to extract the topics of a query and then routes the query toward relevant peers in a peer-to-peer network. Evaluated on three datasets with multiple LLMs, DRAG with TARW performs close to centralized RAG while using half as many messages as flooding. Because data stays with its owners and no central knowledge base has to be maintained, the design targets both the privacy and the scalability problems of edge deployment.

📝 Abstract
As large language models (LLMs) become increasingly adopted on edge devices, Retrieval-Augmented Generation (RAG) is gaining prominence as a solution to address factual deficiencies and hallucinations by integrating external knowledge. However, centralized RAG architectures face significant challenges in data privacy and scalability. For instance, smart healthcare services often rely on collecting sensitive patient data and building a centralized knowledge base to provide better diagnosis and treatment advice, while privacy concerns significantly impede this process. In addition, maintaining a comprehensive and continuously updated knowledge base is costly, particularly in response to regional epidemics and rapidly mutating viruses. To address these challenges, this paper introduces Distributed Retrieval-Augmented Generation (DRAG), a novel framework that improves data privacy by eliminating the need for a centralized knowledge base and restoring data control to owners. DRAG incorporates a Topic-Aware Random Walk (TARW) algorithm that leverages LLMs to extract query topics and facilitate targeted peer discovery within a peer-to-peer network, enabling efficient knowledge retrieval in decentralized environments. Extensive experiments across three diverse datasets and LLMs demonstrate that DRAG with TARW achieves near-centralized RAG performance while using half as many messages as flooding. The code is available at https://github.com/xuchenhao001/DRAG.
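The abstract describes TARW only at a high level: extract the topics of a query, then use them to steer a random walk toward relevant peers, with flooding as the baseline. The sketch below is one possible reading of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the toy network `PEERS`, the keyword-matching `extract_topics` (a stand-in for LLM-based topic extraction), the `ttl` hop limit, and the rule of preferring unvisited neighbors that advertise a matching topic.

```python
import random
from collections import deque

# Hypothetical toy network: peer -> (neighbors, topics the peer holds knowledge on).
PEERS = {
    "a": (["b", "c"], {"weather"}),
    "b": (["a", "d"], {"sports"}),
    "c": (["a", "d", "e"], {"virology"}),
    "d": (["b", "c", "e"], {"cardiology"}),
    "e": (["c", "d"], {"virology", "epidemics"}),
}

def extract_topics(query, vocabulary):
    """Stand-in for LLM-based topic extraction: plain keyword matching."""
    return set(query.lower().split()) & vocabulary

def topic_aware_walk(start, topics, ttl, rng):
    """Forward the query one hop at a time, preferring neighbors that
    advertise a query topic. Returns (matching_peer_or_None, messages_sent)."""
    current, messages, visited = start, 0, {start}
    for _ in range(ttl):
        if PEERS[current][1] & topics:
            return current, messages
        neighbors = PEERS[current][0]
        # Avoid revisiting peers when possible; fall back to all neighbors.
        candidates = [n for n in neighbors if n not in visited] or neighbors
        matching = [n for n in candidates if PEERS[n][1] & topics]
        current = rng.choice(matching or candidates)
        visited.add(current)
        messages += 1
    return (current if PEERS[current][1] & topics else None), messages

def flood(start, topics, ttl):
    """Baseline: each reached peer forwards the query to all its neighbors once.
    Returns (matching_peers, messages_sent)."""
    seen, frontier, messages, hits = {start}, deque([(start, 0)]), 0, []
    while frontier:
        peer, depth = frontier.popleft()
        if PEERS[peer][1] & topics:
            hits.append(peer)
        if depth == ttl:
            continue
        for n in PEERS[peer][0]:
            messages += 1  # one message per forwarded copy, even to seen peers
            if n not in seen:
                seen.add(n)
                frontier.append((n, depth + 1))
    return hits, messages

if __name__ == "__main__":
    vocabulary = set().union(*(topics for _, topics in PEERS.values()))
    topics = extract_topics("latest virology findings", vocabulary)
    print(topic_aware_walk("a", topics, ttl=4, rng=random.Random(0)))
    print(flood("a", topics, ttl=4))
```

On this toy graph the walk reaches a matching peer with far fewer messages than flooding, but flooding finds every matching peer while the walk stops at the first one. The numbers say nothing about the paper's measurements; the sketch only illustrates the trade-off between message count and coverage that the abstract refers to.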
Problem

Research questions and friction points this paper is trying to address.

Addresses privacy and scalability in centralized RAG architectures
Eliminates centralized knowledge base for better data control
Enables efficient decentralized knowledge retrieval using P2P networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed RAG framework enhances data privacy
Topic-Aware Random Walk algorithm for peer discovery
Decentralized knowledge retrieval with near-centralized performance