🤖 AI Summary
This work addresses the limitations of existing graph-structured retrieval-augmented generation (RAG) methods, which often disrupt local graph topology through manually defined compression quotas and over-rely on structural cues at the expense of semantic coherence. To overcome these issues, the authors propose reframing attributed graph retrieval as a hierarchical retrieval framework grounded in encoding trees, employing a joint semantics-and-structure-guided adaptive compression strategy to enable both organized knowledge structuring and efficient retrieval. A key innovation is the introduction of a Semantic–Structural Entropy (S²-Entropy) metric that drives globally optimized hierarchical partitioning, preserving topological integrity while enhancing semantic consistency. Experimental results demonstrate that the proposed approach significantly outperforms current RAG methods across multiple graph reasoning benchmarks, yielding more contextually relevant and logically coherent responses to complex queries.
📝 Abstract
Retrieval-Augmented Generation (RAG) has significantly enhanced Large Language Models' ability to access external knowledge, yet current graph-based RAG approaches face two critical limitations in managing hierarchical information: they impose rigid layer-specific compression quotas that damage local graph structures, and they prioritize topological structure while neglecting semantic content. We introduce T-Retriever, a novel framework that reformulates attributed graph retrieval as tree-based retrieval using a semantics- and structure-guided encoding tree. Our approach features two key innovations: (1) Adaptive Compression Encoding, which replaces artificial compression quotas with a global optimization strategy that preserves the graph's natural hierarchical organization, and (2) Semantic-Structural Entropy ($S^2$-Entropy), which jointly optimizes for both structural cohesion and semantic consistency when creating hierarchical partitions. Experiments across diverse graph reasoning benchmarks demonstrate that T-Retriever significantly outperforms state-of-the-art RAG methods, providing more coherent and contextually relevant responses to complex queries.
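The abstract does not state the $S^2$-Entropy formula, so the sketch below illustrates only the structural half of the idea, using the standard two-level structural entropy of a graph partition from the structural information theory literature. It is an assumption that T-Retriever builds on this quantity; the function name `structural_entropy`, the toy graph, and both partitions are hypothetical, and the semantic term is deliberately omitted.

```python
import math
from collections import defaultdict


def structural_entropy(edges, partition):
    """Two-level structural entropy of an undirected graph under a partition.

    edges: list of (u, v) pairs; partition: list of disjoint node sets.
    Lower values mean the partition captures more of the graph's structure.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total_volume = sum(degree.values())  # equals 2 * number of edges

    module_of = {n: i for i, module in enumerate(partition) for n in module}
    volume = [sum(degree[n] for n in module) for module in partition]

    # Count, per module, the edges with exactly one endpoint inside it.
    cut = [0] * len(partition)
    for u, v in edges:
        if module_of[u] != module_of[v]:
            cut[module_of[u]] += 1
            cut[module_of[v]] += 1

    entropy = 0.0
    for i, module in enumerate(partition):
        if volume[i] == 0:
            continue
        # Cost of a random walk entering module i from the rest of the graph.
        entropy -= cut[i] / total_volume * math.log2(volume[i] / total_volume)
        # Cost of locating a node once inside module i.
        for n in module:
            if degree[n] > 0:
                entropy -= degree[n] / total_volume * math.log2(degree[n] / volume[i])
    return entropy


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
natural = [{0, 1, 2}, {3, 4, 5}]      # follows the two triangles
scrambled = [{0, 3}, {1, 4}, {2, 5}]  # cuts across both triangles

h_natural = structural_entropy(edges, natural)
h_scrambled = structural_entropy(edges, scrambled)
print(f"natural: {h_natural:.3f}  scrambled: {h_scrambled:.3f}")
```

On this toy graph the partition that follows the two triangles scores lower than the one that cuts across them. Minimizing such a quantity over all partitions, rather than fixing how many nodes each layer must keep, is one way to read the abstract's contrast between global optimization and manual compression quotas.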