FD-RAG: Federated Dual-System Retrieval-Augmented Generation

📅 2026-05-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenges of deploying conventional centralized Retrieval-Augmented Generation (RAG) in edge computing environments, where data privacy constraints, device heterogeneity, and the high cost of large language model (LLM) invocation hinder practicality. To overcome these limitations, the authors propose a federated dual-path RAG framework that constructs semantic-aware adaptive hypergraphs locally to encode knowledge structures and distills them into compact question-answering memory. LLMs are invoked only when necessary, decoupling lightweight retrieval from heavyweight reasoning. This approach uniquely integrates federated learning with dual-path RAG, incorporating hypergraph modeling and memory distillation to mitigate cross-device knowledge fragmentation while preserving privacy. Experiments demonstrate up to a 7.8% improvement in question-answering accuracy and an 8.4× reduction in latency, with theoretical analysis establishing an 𝒪(1/ε²) convergence rate for the hypergraph learning component.
๐Ÿ“ Abstract
Retrieval-augmented generation (RAG) has emerged as a paradigm for grounding large language models in external knowledge, yet most existing RAG systems assume centralized knowledge access and ample computation. These assumptions break down in edge environments, where knowledge is fragmented across devices, raw data cannot be shared, and repeated LLM calls are prohibitively expensive. We propose FD-RAG, a federated dual-system RAG framework that decouples lightweight memory access from on-demand LLM reasoning for decentralized deployment. Specifically, FD-RAG learns semantic-aware adaptive hypergraphs over local corpora and distills them into compact QA memories. At inference time, it answers well-covered queries via direct memory matching and invokes LLM-based reasoning only when necessary, while tracing retrieved memories to hypergraph-grounded evidence. To mitigate cross-device knowledge fragmentation, FD-RAG aggregates anonymized memories across devices without exposing raw documents. Experiments on QA benchmarks show that FD-RAG improves accuracy by up to 7.8\% while reducing latency by 8.4$\times$ compared with strong local and federated baselines. We also provide theoretical analysis establishing an $\mathcal{O}(1/\epsilon^{2})$ convergence rate for the proposed hypergraph learning, supporting its tractable deployment in edge settings.
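The inference-time behavior described in the abstract (answer from memory when a query is well covered, fall back to the LLM otherwise) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function, the cosine similarity measure, the `threshold` value, and the `llm` callable are all hypothetical stand-ins chosen for illustration, and the hypergraph evidence tracing and federated aggregation steps are omitted.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def answer(
    query: str,
    memory: List[Tuple[str, str]],   # distilled (question, answer) pairs
    llm: Callable[[str], str],       # hypothetical heavyweight fallback
    threshold: float = 0.8,          # assumed coverage cutoff, not from the paper
) -> Tuple[str, str]:
    """Return (answer, path), where path is "memory" or "llm"."""
    query_vec = embed(query)
    best_score, best_answer = 0.0, None
    for question, stored_answer in memory:
        score = cosine(query_vec, embed(question))
        if score > best_score:
            best_score, best_answer = score, stored_answer
    # Lightweight path: the query is well covered by a stored QA pair.
    if best_answer is not None and best_score >= threshold:
        return best_answer, "memory"
    # Heavyweight path: invoke the LLM only when memory matching fails.
    return llm(query), "llm"
```

For example, with `memory = [("what is the capital of france", "Paris")]`, a closely matching query is served from memory without calling `llm`, while an unrelated query is routed to the fallback. In a sketch like this, the threshold trades latency against answer quality: a lower value sends more queries down the cheap path at the risk of returning a poorly matched memory.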
Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation
Federated Learning
Edge Computing
Knowledge Fragmentation
Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated RAG
Hypergraph Learning
Memory Distillation
Decentralized LLM
Edge AI