🤖 AI Summary
This work addresses the limitations of existing retrieval-augmented generation (RAG) approaches in multi-hop question answering, where single-step retrieval cannot support multi-hop reasoning and graph-based methods suffer from noisy graphs and high computational overhead. To overcome these challenges, the authors propose MultiCube-RAG, a training-free framework built on an ontology-based cube structure with multiple orthogonal dimensions. The method decomposes a complex query into simple sub-queries along the cube dimensions and selects the most suitable specialized cubes for stepwise retrieval and reasoning. This enables modular, interpretable multi-hop inference while avoiding graph-induced noise and training costs. On four multi-hop QA benchmarks, MultiCube-RAG improves response accuracy by 8.9% over the average performance of the baselines, and the authors also report greater efficiency and inherent explainability.
📝 Abstract
Multi-hop question answering (QA) requires multi-step reasoning and retrieval across interconnected subjects, attributes, and relations. Existing retrieval-augmented generation (RAG) methods struggle to capture these structural semantics accurately, resulting in suboptimal performance. Graph-based RAG methods organize such information in graphs, but the resulting graphs are often noisy and computationally expensive. Moreover, most methods rely on single-step retrieval, neglecting the need for multi-hop reasoning. Recent training-based approaches attempt to incentivize large language models (LLMs) to reason and retrieve iteratively, but their training is prone to unstable convergence and high computational overhead. To address these limitations, we devise an ontology-based cube structure with multiple orthogonal dimensions to model structural subjects, attributes, and relations. Built on this cube structure, we propose MultiCube-RAG, a training-free method consisting of multiple cubes for multi-step reasoning and retrieval. Each cube specializes in modeling one class of subjects, so MultiCube-RAG can flexibly select the most suitable cubes to acquire the relevant knowledge precisely. To enhance query-based reasoning and retrieval, our method decomposes a complex multi-hop query into a set of simple subqueries along the cube dimensions and resolves them sequentially. Experiments on four multi-hop QA datasets show that MultiCube-RAG improves response accuracy by 8.9% over the average performance of various baselines. Notably, we also demonstrate that our method is more efficient and inherently explainable.
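The abstract does not specify how a cube is implemented, so the following Python sketch is only a toy illustration of the general idea: one cube per class of subjects, cells indexed along a subject dimension and an attribute/relation dimension, and subqueries resolved one hop at a time. The names `Cube`, `SubQuery`, `answer_multi_hop`, and the `"#prev"` placeholder are hypothetical, and the decomposition into subqueries is written by hand here rather than produced by the method itself.

```python
from dataclasses import dataclass, field

PREV = "#prev"  # hypothetical placeholder meaning "answer of the previous hop"


@dataclass
class Cube:
    """Toy cube for one class of subjects; cells are keyed by (subject, dimension)."""
    subject_class: str
    cells: dict = field(default_factory=dict)

    def add(self, subject, dimension, value):
        self.cells[(subject, dimension)] = value

    def lookup(self, subject, dimension):
        # Returns None when the cube holds no value for this cell.
        return self.cells.get((subject, dimension))


@dataclass
class SubQuery:
    subject_class: str  # selects which cube to consult
    subject: str        # a literal subject, or PREV
    dimension: str      # attribute or relation to read


def answer_multi_hop(subqueries, cubes):
    """Resolve subqueries in order, feeding each answer into the next hop."""
    prev, trace = None, []
    for sq in subqueries:
        subject = prev if sq.subject == PREV else sq.subject
        prev = cubes[sq.subject_class].lookup(subject, sq.dimension)
        trace.append((sq.subject_class, subject, sq.dimension, prev))
        if prev is None:  # stop early if a hop cannot be answered
            break
    return prev, trace


# Assumed toy data: one cube for films, one for people.
films = Cube("film")
films.add("Inception", "directed_by", "Christopher Nolan")
people = Cube("person")
people.add("Christopher Nolan", "birthplace", "London")
cubes = {"film": films, "person": people}

# "Where was the director of Inception born?" split by hand into two hops.
hops = [
    SubQuery("film", "Inception", "directed_by"),
    SubQuery("person", PREV, "birthplace"),
]
final, trace = answer_multi_hop(hops, cubes)
print(final)  # London
```

The returned `trace` records which cube and cell answered each hop, which is one simple way a stepwise design of this kind can expose its reasoning path.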