FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation

📅 2025-06-14
🤖 AI Summary
Existing RAG frameworks suffer from poor reproducibility, delayed integration of emerging techniques, and high system overhead. To address these challenges, the authors propose an open-source RAG framework tailored for research and rapid prototyping, featuring the first unified architecture supporting three retrieval-augmentation paradigms: textual, multimodal, and web-based. The framework employs asynchronous I/O and a modular design, integrating vector, keyword, graph, and web retrievers. It further supports LLM adaptation, dynamic routing, and cache-aware execution, alongside full lifecycle management and persistent caching. Extensive evaluation across multiple benchmark tasks demonstrates low latency, high throughput, and strong cross-modal generalization, significantly improving the development efficiency and reproducibility of RAG systems. The implementation is publicly available.

📝 Abstract
Retrieval-Augmented Generation (RAG) plays a pivotal role in modern large language model applications, with numerous existing frameworks offering a wide range of functionalities to facilitate the development of RAG systems. However, we have identified several persistent challenges in these frameworks, including difficulties in algorithm reproduction and sharing, lack of new techniques, and high system overhead. To address these limitations, we introduce FlexRAG, an open-source framework specifically designed for research and prototyping. FlexRAG supports text-based, multimodal, and network-based RAG, providing comprehensive lifecycle support alongside efficient asynchronous processing and persistent caching capabilities. By offering a robust and flexible solution, FlexRAG enables researchers to rapidly develop, deploy, and share advanced RAG systems. Our toolkit and resources are available at https://github.com/ictnlp/FlexRAG.
Problem

Research questions and friction points this paper is trying to address.

Addressing difficulties in algorithm reproduction and sharing
Overcoming the lack of support for new techniques in existing RAG frameworks
Reducing high system overhead in existing RAG systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source framework for RAG research
Supports multimodal and network-based RAG
Efficient async processing with persistent caching
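
To make the last point concrete, the following is a minimal sketch of how asynchronous retrieval might be combined with a persistent cache. It is not FlexRAG's API: `PersistentCache`, `fake_retrieve`, `cached_retrieve`, and `retrieve_batch` are hypothetical names invented for illustration, and the retriever is a stub that only simulates I/O latency. The example assumes a SQLite-backed key/value store; passing a file path instead of `:memory:` would make the cache survive across runs.

```python
import asyncio
import hashlib
import json
import sqlite3


class PersistentCache:
    """Tiny SQLite-backed key/value cache (hypothetical helper, not FlexRAG's API)."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the demo self-contained; use a file path to persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)"
        )

    def get(self, key):
        row = self.conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else json.loads(row[0])

    def set(self, key, value):
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.conn.commit()


async def fake_retrieve(query):
    """Stand-in for a slow retriever call (assumption: real retrieval is I/O bound)."""
    await asyncio.sleep(0.01)
    return [f"passage about {query}"]


async def cached_retrieve(query, cache, stats):
    """Return cached passages when available, otherwise retrieve and store them."""
    key = hashlib.sha256(query.encode("utf-8")).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        stats["hits"] += 1
        return hit
    stats["misses"] += 1
    result = await fake_retrieve(query)
    cache.set(key, result)
    return result


async def retrieve_batch(queries, cache, stats):
    """Run all lookups concurrently; results keep the order of `queries`."""
    return await asyncio.gather(*(cached_retrieve(q, cache, stats) for q in queries))


def demo():
    cache = PersistentCache()
    stats = {"hits": 0, "misses": 0}
    queries = ["rag", "caching"]
    first = asyncio.run(retrieve_batch(queries, cache, stats))   # cold cache
    second = asyncio.run(retrieve_batch(queries, cache, stats))  # warm cache
    return first, second, stats
```

Calling `demo()` runs the same batch twice: the first pass goes to the stub retriever, while the second is served entirely from the cache. In this sketch, repeated experiments over the same queries avoid redundant retrieval work, which is the kind of overhead reduction the bullet above refers to.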
Zhuocheng Zhang
Institute of Computing Technology, Chinese Academy of Sciences
Natural Language Processing
Yang Feng
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (ICT/CAS); University of Chinese Academy of Sciences, China; Key Laboratory of AI Safety, Chinese Academy of Sciences
Min Zhang
Institute of Computing and Intelligence, Harbin Institute of Technology (Shenzhen), China