🤖 AI Summary
Global meaning construction—i.e., cross-corpus holistic information integration and reasoning—faces challenges including retrieval incompleteness, topic over-generalization, and high inference overhead. To address these, we propose ReTAG: a framework that first constructs topic-enhanced subgraphs via topic modeling, then integrates retrieval mechanisms for precise key-information localization, and finally leverages synergistic graph neural networks and generative models to perform multi-hop reasoning and answer synthesis. Compared to existing graph-based approaches, ReTAG achieves state-of-the-art performance on benchmarks such as HotpotQA, significantly improving multi-hop question answering accuracy while reducing inference latency by 42%–68%. Its core contribution lies in the first deep coupling of topic-aware subgraph construction with retrieval augmentation—effectively balancing semantic focus and computational efficiency—and establishing a scalable new paradigm for global reasoning over large-scale corpora.
📝 Abstract
Recent advances in question answering have led to substantial progress in tasks such as multi-hop reasoning. However, global sensemaking-answering questions by synthesizing information from an entire corpus remains a significant challenge. A prior graph-based approach to global sensemaking lacks retrieval mechanisms, topic specificity, and incurs high inference costs. To address these limitations, we propose ReTAG, a Retrieval-Enhanced, Topic-Augmented Graph framework that constructs topic-specific subgraphs and retrieves the relevant summaries for response generation. Experiments show that ReTAG improves response quality while significantly reducing inference time compared to the baseline. Our code is available at https://github.com/bykimby/retag.