🤖 AI Summary
Problem: Existing RAG methods rely on flat knowledge retrieval and overlook the hierarchical structure inherent in human cognition, which limits semantic understanding and generalization in domain-specific tasks.
Method: We propose HiRAG, the first framework to systematically model and leverage knowledge hierarchy for RAG enhancement. It constructs a graph-based hierarchical knowledge index during indexing and introduces a multi-granularity routing mechanism during retrieval to enable semantics-aware, layered matching, tightly integrated with LLM-based generation.
Contribution/Results: HiRAG departs from the conventional flat paradigm, improving complex reasoning and cross-granularity generalization. Across multiple domain-specific benchmarks, it achieves significant performance improvements over the state-of-the-art baseline methods it is compared against. The implementation is publicly available.
📝 Abstract
Graph-based Retrieval-Augmented Generation (RAG) methods have significantly enhanced the performance of large language models (LLMs) in domain-specific tasks. However, existing RAG methods do not adequately utilize the hierarchical knowledge inherent in human cognition, which limits the capabilities of RAG systems. In this paper, we introduce a new RAG approach, called HiRAG, which utilizes hierarchical knowledge to enhance the semantic understanding and structure-capturing capabilities of RAG systems in the indexing and retrieval processes. Our extensive experiments demonstrate that HiRAG achieves significant performance improvements over the state-of-the-art baseline methods. The code of our proposed method is available at https://github.com/hhy-huang/HiRAG.
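
To make the idea of hierarchical indexing and layered retrieval concrete, here is a minimal toy sketch in Python. It is not the HiRAG implementation and does not use the repository's API. Every name in it (`Node`, `build_index`, `retrieve`, the bag-of-words `embed`) is hypothetical, the grouping of entities into summary nodes is supplied by hand rather than learned, and a real system would use an LLM and a dense encoder instead.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Node:
    name: str
    text: str
    layer: int  # 0 = entity, 1 = summary of a group of entities
    children: list["Node"] = field(default_factory=list)


def build_index(entities: dict[str, str], groups: dict[str, list[str]]) -> list[Node]:
    """Indexing: layer 0 holds entities, layer 1 holds one summary node per group.

    Assumption: groups are given by hand; a real system would derive them
    (for example by clustering) and write summaries with an LLM.
    """
    leaves = {name: Node(name, text, 0) for name, text in entities.items()}
    summaries = []
    for group_name, members in groups.items():
        kids = [leaves[m] for m in members]
        summary_text = group_name + " " + " ".join(k.text for k in kids)
        summaries.append(Node(group_name, summary_text, 1, kids))
    return summaries


def retrieve(query: str, summaries: list[Node], k: int = 2) -> list[str]:
    """Retrieval: match the coarse layer first, then rank entities beneath it."""
    q = embed(query)
    best = max(summaries, key=lambda s: cosine(q, embed(s.text)))
    ranked = sorted(best.children, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return [best.name] + [c.name for c in ranked[:k]]


entities = {
    "transformer": "neural network architecture using attention",
    "bert": "pretrained transformer encoder language model",
    "mitochondria": "organelle producing energy in the cell",
    "ribosome": "organelle that synthesizes proteins in the cell",
}
groups = {
    "machine learning": ["transformer", "bert"],
    "cell biology": ["mitochondria", "ribosome"],
}
index = build_index(entities, groups)
print(retrieve("which organelle produces energy in the cell", index))
```

The returned list places the matched summary node first and its best-matching entities after it, so a downstream prompt could include both coarse and fine-grained context. A flat retriever would skip the first step and rank all entities directly.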