🤖 AI Summary
Non-factoid question answering (NFQA) involves open-ended, multi-faceted questions that have no single correct answer and require reasoning across dimensions such as debate, experience, and comparison, which makes conventional retrieval-augmented generation (RAG) inadequate. Method: We propose Typed-RAG, a type-aware, multi-aspect decomposition framework that first classifies each question by type (e.g., debate, experience, comparison), then decomposes it into single-aspect sub-queries, retrieves evidence for each sub-query, and aggregates the results to generate the answer. Contribution/Results: We introduce Wiki-NFQA, a benchmark dataset covering diverse non-factoid question (NFQ) types. Experiments on Wiki-NFQA show that Typed-RAG outperforms RAG baselines, yielding more informative and contextually relevant answers. Code and dataset are publicly released.
📝 Abstract
Non-factoid question-answering (NFQA) poses a significant challenge due to its open-ended nature, diverse intents, and the need for multi-aspect reasoning, which renders conventional factoid QA approaches, including retrieval-augmented generation (RAG), inadequate. Unlike factoid questions, non-factoid questions (NFQs) lack definitive answers and require synthesizing information from multiple sources across various reasoning dimensions. To address these limitations, we introduce Typed-RAG, a type-aware multi-aspect decomposition framework within the RAG paradigm for NFQA. Typed-RAG classifies NFQs into distinct types (such as debate, experience, and comparison) and applies aspect-based decomposition to refine retrieval and generation strategies. By decomposing multi-aspect NFQs into single-aspect sub-queries and aggregating the results, Typed-RAG generates more informative and contextually relevant responses. To evaluate Typed-RAG, we introduce Wiki-NFQA, a benchmark dataset covering diverse NFQ types. Experimental results demonstrate that Typed-RAG outperforms baselines, thereby highlighting the importance of type-aware decomposition for effective retrieval and generation in NFQA. Our code and dataset are available at https://github.com/TeamNLP/Typed-RAG.
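To make the classify, decompose, retrieve, and aggregate flow concrete, here is a minimal sketch in Python. It is not the Typed-RAG implementation: the keyword classifier, the decomposition rules, the word-overlap retriever, the toy corpus, and every function name (`classify`, `decompose`, `retrieve`, `aggregate`, `typed_rag`) are assumptions made for illustration. A real system would presumably use learned or LLM-based components at each step.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy corpus standing in for a real document store.
CORPUS = [
    "Electric cars have low running costs.",
    "Petrol cars refuel in minutes.",
    "Supporters of remote work cite flexibility.",
    "Critics of remote work cite isolation.",
]


def tokens(text: str) -> set:
    """Lowercase alphabetic tokens of a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(question: str) -> str:
    """Keyword heuristic standing in for an NFQ type classifier (assumption)."""
    q = question.lower()
    if " vs " in q or "compare" in q:
        return "comparison"
    if q.startswith("should"):
        return "debate"
    return "other"


def decompose(question: str, qtype: str) -> List[str]:
    """Split a multi-aspect question into single-aspect sub-queries by type."""
    if qtype == "comparison":
        # One sub-query per compared item.
        return [part.strip() for part in question.lower().split(" vs ")]
    if qtype == "debate":
        # One sub-query per side of the debate.
        return [f"supporters of: {question}", f"critics of: {question}"]
    return [question]


def retrieve(query: str, k: int = 1) -> List[str]:
    """Rank passages by word overlap with the query (stand-in for a retriever)."""
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda doc: len(q & tokens(doc)), reverse=True)
    return ranked[:k]


def aggregate(evidence: Dict[str, List[str]]) -> str:
    """Merge per-sub-query passages, dropping duplicates and keeping order.

    A real system would pass this evidence to a generator model instead.
    """
    seen: List[str] = []
    for passages in evidence.values():
        for passage in passages:
            if passage not in seen:
                seen.append(passage)
    return " ".join(seen)


def typed_rag(question: str) -> Tuple[str, str]:
    """Run the four steps and return (question type, aggregated answer)."""
    qtype = classify(question)
    sub_queries = decompose(question, qtype)
    evidence = {sq: retrieve(sq) for sq in sub_queries}
    return qtype, aggregate(evidence)


if __name__ == "__main__":
    print(typed_rag("electric cars vs petrol cars"))
    print(typed_rag("Should companies allow remote work?"))
```

In this sketch the question type decides only how the question is split. Each sub-query is retrieved independently, so a passage covering one aspect cannot crowd out passages covering the others before aggregation.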