🤖 AI Summary
Existing graph-augmented RAG systems rely on a single retriever, which limits their ability to handle complex queries and leaves them vulnerable to noisy or irrelevant information. To address these limitations, we propose MIXRAG, a dynamic routing framework based on a Mixture-of-Experts (MoE) architecture. The method combines a query-aware graph encoder with multi-task specialized training to jointly model multiple semantic dimensions, including entities, relations, and subgraph structure. A learnable dynamic routing mechanism adaptively selects and combines expert retrievers at retrieval time, performing relevance scoring and noise suppression simultaneously. Evaluated on multiple knowledge-intensive graph reasoning benchmarks, MIXRAG achieves state-of-the-art performance, significantly outperforming existing baselines, generalizes well across diverse query types, and remains practical to deploy thanks to its modular, scalable design.
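As a rough illustration of the dynamic-routing idea, the sketch below implements a top-k gated Mixture-of-Experts fusion over pre-computed expert retriever outputs: a learned gate scores each specialized retriever for the input query, keeps the top-k, and fuses their representations. This is a minimal sketch under assumed details; `MoERouter`, the embedding dimensions, the top-k choice, and the three-expert setup are illustrative, not the paper's actual implementation.

```python
# Hypothetical sketch of MoE-style dynamic routing over expert retrievers.
# All names and hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoERouter(nn.Module):
    """Scores expert retrievers for a query embedding and fuses their outputs."""
    def __init__(self, query_dim: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.gate = nn.Linear(query_dim, num_experts)  # learnable routing weights
        self.top_k = top_k

    def forward(self, query_emb: torch.Tensor, expert_outputs: torch.Tensor):
        # query_emb: (batch, query_dim); expert_outputs: (batch, num_experts, dim)
        logits = self.gate(query_emb)                       # (batch, num_experts)
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)               # renormalize over top-k
        chosen = torch.gather(                              # pick top-k expert outputs
            expert_outputs, 1,
            top_idx.unsqueeze(-1).expand(-1, -1, expert_outputs.size(-1)),
        )                                                   # (batch, top_k, dim)
        return (weights.unsqueeze(-1) * chosen).sum(dim=1)  # fused representation

# Usage: fuse three experts (e.g., entity / relation / topology) for a query batch.
router = MoERouter(query_dim=768, num_experts=3, top_k=2)
query = torch.randn(4, 768)        # 4 query embeddings
experts = torch.randn(4, 3, 768)   # per-expert retrieval representations
fused = router(query, experts)     # (4, 768)
```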
📝 Abstract
Large Language Models (LLMs) have achieved impressive performance across a wide range of applications. However, they often suffer from hallucinations in knowledge-intensive domains due to their reliance on static pretraining corpora. To address this limitation, Retrieval-Augmented Generation (RAG) enhances LLMs by incorporating external knowledge sources during inference. Among these sources, textual graphs provide structured and semantically rich information that supports more precise and interpretable reasoning, which has led to growing interest in graph-based RAG systems. Despite their potential, most existing approaches rely on a single retriever to identify relevant subgraphs, which limits their ability to capture the diverse aspects of complex queries. Moreover, these systems often struggle to judge the relevance of retrieved content accurately, making them prone to distraction by irrelevant noise. To address these challenges, we propose MIXRAG, a Mixture-of-Experts Graph-RAG framework that introduces multiple specialized graph retrievers and a dynamic routing controller to better handle diverse query intents. Each retriever is trained to focus on a specific aspect of graph semantics, such as entities, relations, or subgraph topology. A Mixture-of-Experts module adaptively selects and fuses the relevant retrievers based on the input query. To reduce noise in the retrieved information, we introduce a query-aware GraphEncoder that analyzes relationships within the retrieved subgraphs, highlighting the most relevant parts while down-weighting noise. Empirical results demonstrate that our method achieves state-of-the-art performance and consistently outperforms a range of baselines across diverse graph-based tasks and domains. The code will be released upon paper acceptance.
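To make the noise-suppression idea concrete, here is a minimal sketch of a query-aware pooling step: nodes of a retrieved subgraph are re-weighted by their similarity to the query before being aggregated, so low-relevance nodes receive near-zero weight. The function name, temperature, and dot-product scoring are assumptions for illustration, not the paper's actual GraphEncoder.

```python
# Illustrative query-aware attention pooling for noise suppression.
# Names and scoring choices are assumptions, not the paper's method.
import torch
import torch.nn.functional as F

def query_aware_pool(query_emb: torch.Tensor, node_embs: torch.Tensor,
                     temperature: float = 0.1) -> torch.Tensor:
    """Attend over subgraph nodes with the query; irrelevant nodes get
    near-zero weight, which down-weights retrieved noise.

    query_emb: (dim,)           -- encoded user query
    node_embs: (num_nodes, dim) -- encoded nodes of the retrieved subgraph
    """
    scores = node_embs @ query_emb / temperature  # (num_nodes,) relevance scores
    weights = F.softmax(scores, dim=0)            # sharp, query-conditioned weights
    return weights @ node_embs                    # (dim,) pooled subgraph embedding

# Example: pool a 10-node subgraph with 256-d embeddings.
pooled = query_aware_pool(torch.randn(256), torch.randn(10, 256))
```

A low softmax temperature makes the weighting sharper, which is one simple way to realize the "highlight relevant parts, suppress noise" behavior the abstract describes.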