FedRAG: A Framework for Fine-Tuning Retrieval-Augmented Generation Systems

📅 2025-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current RAG systems lack a unified framework supporting end-to-end, joint fine-tuning of retrievers and generators under both centralized and federated architectures, while existing toolchains struggle to accommodate privacy-sensitive and data-isolated scenarios. This paper introduces FedRAG—the first open-source framework enabling dual-mode (centralized/federated) end-to-end RAG fine-tuning, jointly optimizing dual-encoder retrievers and generator models via supervised fine-tuning. Its key contributions are: (1) a unified architectural abstraction with one-click mode switching; (2) deep integration of mainstream RAG libraries (e.g., LlamaIndex, Haystack), enabling modular, customizable pipelines; and (3) significant improvements in factual consistency and response quality across cross-domain, low-resource, and distributed settings. Experiments demonstrate FedRAG’s superior training efficiency and robust performance across multiple benchmarks, while seamlessly integrating with existing RAG workflows.
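The summary above describes supervised fine-tuning where a dual-encoder retriever selects passages that are prepended to the generator's prompt. The toy sketch below illustrates that data flow only; every name in it (`encode`, `retrieve_top_k`, `build_sft_example`) is illustrative and the hash-based "encoder" is a stand-in, none of it is FedRAG's actual API:

```python
# Toy sketch of the retrieval half of a RAG fine-tuning example:
# a dual-encoder retriever scores documents by the dot product of
# query and document embeddings, and the top-k passages become the
# context in a supervised fine-tuning prompt/target pair.
# All names here are illustrative, not FedRAG's real interface.

def encode(text: str, dim: int = 8) -> list[float]:
    """Stand-in encoder: hashes characters into a fixed-size vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch) / 1000.0
    return vec

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def retrieve_top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by embedding similarity and keep the top k."""
    q = encode(query)
    ranked = sorted(docs, key=lambda d: dot(q, encode(d)), reverse=True)
    return ranked[:k]

def build_sft_example(query: str, docs: list[str], answer: str) -> dict:
    """Assemble one supervised fine-tuning example with retrieved context."""
    context = "\n".join(retrieve_top_k(query, docs))
    return {"prompt": f"Context:\n{context}\n\nQuestion: {query}",
            "target": answer}

docs = ["RAG combines retrieval with generation.",
        "Federated learning keeps data on client devices.",
        "Bananas are yellow."]
example = build_sft_example("What is federated learning?", docs,
                            "Training across clients without sharing raw data.")
print(example["prompt"].startswith("Context:"))  # True
```

In a real system the loss on `target` would be backpropagated through the generator, and retriever-training methods would additionally update both encoders; this sketch only shows how an SFT example is assembled.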

📝 Abstract
Retrieval-augmented generation (RAG) systems have been shown to be effective in addressing many of the drawbacks of relying solely on the parametric memory of large language models. Recent work has demonstrated that RAG systems can be improved via fine-tuning of their retriever and generator models. In this work, we introduce FedRAG, a framework for fine-tuning RAG systems across centralized and federated architectures. FedRAG supports state-of-the-art fine-tuning methods, offering a simple and intuitive interface and a seamless conversion from centralized to federated training tasks. FedRAG is also deeply integrated with the modern RAG ecosystem, filling a critical gap in available tools.
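The abstract highlights "seamless conversion from centralized to federated training tasks." A common pattern behind such a conversion is federated averaging (FedAvg): the same centralized training step is reused unchanged on each client, and a server averages the resulting parameters. The sketch below shows that pattern on a toy scalar model; it is a generic illustration under that assumption, not FedRAG's actual conversion mechanism, and all names are hypothetical:

```python
# Minimal FedAvg sketch: a centralized train step reused per client,
# with server-side parameter averaging. Toy model y = w * x fit to
# the target y = x, so w should converge to 1.0.
# Names are illustrative, not FedRAG's real interface.

def train_step(params: list[float], data: list[float],
               lr: float = 0.1) -> list[float]:
    """One centralized gradient-descent step on mean squared error."""
    w = params[0]
    grad = sum(2 * (w * x - x) * x for x in data) / len(data)
    return [w - lr * grad]

def federated_round(params: list[float],
                    client_datasets: list[list[float]]) -> list[float]:
    """Run the unchanged centralized step on each client, then average."""
    client_params = [train_step(list(params), d) for d in client_datasets]
    n = len(client_params)
    return [sum(p[i] for p in client_params) / n for i in range(len(params))]

params = [0.0]
clients = [[1.0, 2.0], [3.0], [0.5, 1.5, 2.5]]  # data never leaves a client
for _ in range(50):
    params = federated_round(params, clients)
print(round(params[0], 2))  # 1.0
```

The key property, and the one the abstract's "seamless conversion" claim speaks to, is that `train_step` is identical in both modes: only the orchestration around it changes.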
Problem

Research questions and friction points this paper is trying to address.

Fine-tuning RAG systems under both centralized and federated architectures
Improving the retriever and generator models within RAG systems
Bridging the gap between fine-tuning tools and the modern RAG ecosystem
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes RAG systems across both centralized and federated architectures
Supports state-of-the-art fine-tuning methods through a simple, intuitive interface
Integrates deeply with the modern RAG ecosystem (e.g., LlamaIndex, Haystack)
Val Andrei Fajardo
Vector Institute, Toronto ON M5G 0C6, Canada
David B. Emerson
Vector Institute, Toronto ON M5G 0C6, Canada
Amandeep Singh
Vector Institute, Toronto ON M5G 0C6, Canada
Veronica Chatrath
Vector Institute, Toronto ON M5G 0C6, Canada
Marcelo Lotif
Vector Institute, Toronto ON M5G 0C6, Canada
Ravi Theja
Independent Researcher, Toronto, Canada
Alex Cheung
Independent Researcher, Toronto, Canada
Izuki Matsubi
Independent Researcher, Toronto, Canada