🤖 AI Summary
This work addresses the high retrieval cost and fragmentation of existing Retrieval-Augmented Generation (RAG) methods, which stem from misaligned memory organization. Inspired by human intuition-guided reasoning, the authors propose IGMiRAG, a RAG framework that aligns multi-granularity knowledge in a hierarchical heterogeneous hypergraph. At query time, a question parser sets the depth and window of memory exploration, and a dual-focus retrieval mechanism activates instantaneous memories as anchors. A bidirectional diffusion algorithm then follows deductive paths from those anchors to mine deeper memories. The approach builds human-like reasoning into both the memory structure and the allocation of retrieval resources, so retrieval cost adapts to task complexity. The authors report that it outperforms the state-of-the-art baseline by 4.8% in EM and 5.0% in F1 overall, with token costs averaging 6.3k+ and as low as 3.0k+.
📝 Abstract
Retrieval-augmented generation (RAG) equips large language models (LLMs) with reliable knowledge memory. To strengthen cross-text associations, recent research integrates graphs and hypergraphs into RAG to capture pairwise and multi-entity relations as structured links. However, the misaligned memory organization of these methods necessitates costly, disjointed retrieval. To address these limitations, we propose IGMiRAG, a framework inspired by human intuition-guided reasoning. It constructs a hierarchical heterogeneous hypergraph to align multi-granular knowledge, incorporating deductive pathways to simulate realistic memory structures. During querying, IGMiRAG distills intuitive strategies via a question parser to control mining depth and memory window, and activates instantaneous memories as anchors using dual-focus retrieval. Mirroring human intuition, the framework dynamically guides the allocation of retrieval resources. Furthermore, we design a bidirectional diffusion algorithm that navigates deductive paths to mine in-depth memories, emulating human reasoning processes. Extensive evaluations indicate that IGMiRAG outperforms the state-of-the-art baseline by 4.8% in EM and 5.0% in F1 overall, with token costs adapting to task complexity (average 6.3k+, minimum 3.0k+). This work presents a cost-effective RAG paradigm that improves both efficiency and effectiveness.
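To make the retrieval idea more concrete, the following toy sketch spreads scores from anchor nodes across a small hypergraph. It is not the paper's bidirectional diffusion algorithm, whose details the abstract does not give; it is a simple score-spreading pass written only to illustrate how a depth limit and a result window could bound exploration. The function name `diffuse`, the `decay` parameter, the two-way node-to-hyperedge and hyperedge-to-node update, and the data layout are all assumptions made for this illustration. In IGMiRAG the depth and window are chosen by the question parser; here they are passed in by hand.

```python
from collections import defaultdict


def diffuse(hyperedges, anchors, depth, window, decay=0.5):
    """Toy score spreading over a hypergraph (illustrative only).

    hyperedges: dict mapping hyperedge id -> set of node ids
    anchors:    dict mapping anchor node id -> initial score
    depth:      number of node -> hyperedge -> node rounds (assumed knob)
    window:     number of top-scoring hyperedges to return (assumed knob)
    decay:      fraction of score passed on at each hop (assumed)
    """
    # Inverted index: which hyperedges does each node belong to?
    node_to_edges = defaultdict(set)
    for edge_id, members in hyperedges.items():
        for node in members:
            node_to_edges[node].add(edge_id)

    visited = set(anchors)           # nodes that have already been activated
    edge_score = defaultdict(float)  # accumulated score per hyperedge
    frontier = dict(anchors)         # nodes activated in the previous round

    for _ in range(depth):
        # Direction 1: active nodes push score into their hyperedges.
        edge_gain = defaultdict(float)
        for node, score in frontier.items():
            for edge_id in node_to_edges[node]:
                edge_gain[edge_id] += decay * score

        # Direction 2: hyperedges push score back out to unvisited members.
        next_frontier = defaultdict(float)
        for edge_id, gain in edge_gain.items():
            edge_score[edge_id] += gain
            members = hyperedges[edge_id]
            for node in members:
                if node not in visited:
                    next_frontier[node] += decay * gain / len(members)

        visited.update(next_frontier)
        frontier = next_frontier
        if not frontier:
            break  # nothing new to explore

    # Keep only the `window` best hyperedges; ties break by id for stability.
    ranked = sorted(edge_score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [edge_id for edge_id, _ in ranked[:window]]


if __name__ == "__main__":
    toy = {"e1": {"a", "b"}, "e2": {"b", "c"}, "e3": {"d", "e"}}
    print(diffuse(toy, {"a": 1.0}, depth=1, window=5))
    print(diffuse(toy, {"a": 1.0}, depth=2, window=5))
```

In this toy setup, a shallow pass reaches only hyperedges that contain an anchor, while a deeper pass also reaches hyperedges one hop further away. This mirrors, in a very reduced form, the abstract's point that retrieval cost can scale with how much exploration a question needs.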