🤖 AI Summary
Large language models (LLMs) struggle to integrate new information with prior knowledge and are prone to catastrophic forgetting. Inspired by the hippocampal indexing theory, this work maps the mammalian dual-memory system, comprising neocortical long-term storage and hippocampal rapid indexing, onto a biologically grounded retrieval-augmented generation (RAG) framework called HippoRAG. The framework combines LLMs, knowledge graphs, and the Personalized PageRank algorithm to realize a neurobiologically motivated division of labor between memory components. It enables single-step retrieval that is competitive with iterative methods, running 6–13× faster than IRCoT at 10–30× lower cost. On multi-hop question answering, it outperforms state-of-the-art approaches by up to 20%. It also supports dynamic incremental knowledge integration and cross-document reasoning scenarios that existing RAG methods cannot handle.
📝 Abstract
In order to thrive in hostile and ever-changing natural environments, mammalian brains evolved to store large amounts of knowledge about the world and continually integrate new information while avoiding catastrophic forgetting. Despite their impressive accomplishments, large language models (LLMs), even with retrieval-augmented generation (RAG), still struggle to efficiently and effectively integrate a large amount of new experiences after pre-training. In this work, we introduce HippoRAG, a novel retrieval framework inspired by the hippocampal indexing theory of human long-term memory to enable deeper and more efficient knowledge integration over new experiences. HippoRAG synergistically orchestrates LLMs, knowledge graphs, and the Personalized PageRank algorithm to mimic the different roles of the neocortex and hippocampus in human memory. We compare HippoRAG with existing RAG methods on multi-hop question answering and show that our method outperforms the state-of-the-art methods remarkably, by up to 20%. Single-step retrieval with HippoRAG achieves comparable or better performance than iterative retrieval methods like IRCoT while being 10–30 times cheaper and 6–13 times faster, and integrating HippoRAG into IRCoT brings further substantial gains. Finally, we show that our method can tackle new types of scenarios that are out of reach of existing methods. Code and data are available at https://github.com/OSU-NLP-Group/HippoRAG.
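To make the retrieval idea concrete, the core mechanism described above can be sketched as Personalized PageRank run over a small entity graph, with the walk seeded at entities linked from the query and passages ranked by the scores of the entities they mention. This is a minimal, hypothetical illustration on toy data, not the paper's actual pipeline; all entity names, the passage mapping, and the parameter choices (alpha, iteration count) are assumptions for the example.

```python
# Toy knowledge graph as an undirected adjacency list. Edges stand in for
# relations an LLM might extract from passages (illustrative data only).
edges = [
    ("Stanford", "Alice"),    # e.g., "Alice is a professor at Stanford"
    ("Alice", "Alzheimers"),  # "Alice researches Alzheimer's"
    ("Bob", "UCSD"),
    ("Bob", "Alzheimers"),
]
graph = {}
for u, v in edges:
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, []).append(u)

def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    """Power iteration: with probability alpha follow a random edge,
    otherwise teleport back to the seed (query-linked) entities."""
    nodes = list(graph)
    total = sum(seeds.values())
    teleport = {n: seeds.get(n, 0.0) / total for n in nodes}
    score = dict(teleport)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * teleport[n] for n in nodes}
        for u in nodes:
            share = alpha * score[u] / len(graph[u])
            for v in graph[u]:
                nxt[v] += share
        score = nxt
    return score

# Entities mentioned in the query act as personalization seeds, so score
# mass concentrates on their multi-hop neighborhood in a single pass.
scores = personalized_pagerank(graph, seeds={"Stanford": 1.0, "Alzheimers": 1.0})

# Rank candidate passages by the summed scores of the entities they mention.
passages = {"p1": ["Stanford", "Alice"], "p2": ["Bob", "UCSD"]}
ranked = sorted(passages, key=lambda p: -sum(scores[e] for e in passages[p]))
print(ranked[0])  # the Stanford/Alice passage ranks first
```

Because the random walk spreads probability across graph neighbors of the seeds, a passage connecting both query entities (here via "Alice") can be retrieved in one step, which is the kind of multi-hop behavior that iterative retrievers otherwise need several LLM-guided rounds to achieve.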