🤖 AI Summary
Large language models (LLMs) suffer from weak reasoning, outdated knowledge, and prominent hallucination in complex question answering (QA). Method: This paper systematically investigates LLM–knowledge graph (KG) synergistic enhancement mechanisms, proposing the first structured taxonomy for LLM–KG integration tailored to QA—categorizing approaches along two dimensions: QA type and KG role. It explicitly aligns complex QA challenges (e.g., multi-hop reasoning, temporal sensitivity, explainability) with corresponding KG-augmentation strategies (e.g., retrieval-augmented generation, graph neural networks, neuro-symbolic hybrid modeling, multi-hop KG traversal). Contribution/Results: The work provides a unified analysis of state-of-the-art methods across factual accuracy, temporal freshness, explainability, and complex reasoning capabilities, delineating their effectiveness boundaries. It consolidates evaluation criteria, benchmark datasets, and identifies key open research directions for robust, trustworthy, and temporally aware LLM–KG integration in QA.
📝 Abstract
Large language models (LLMs) have demonstrated remarkable performance on question-answering (QA) tasks because of their superior capabilities in natural language understanding and generation. However, LLM-based QA struggles with complex QA tasks due to poor reasoning capacity, outdated knowledge, and hallucinations. Several recent works synthesize LLMs and knowledge graphs (KGs) for QA to address the above challenges. In this survey, we propose a new structured taxonomy that categorizes the methodology of synthesizing LLMs and KGs for QA according to the categories of QA and the KG's role when integrating with LLMs. We systematically survey state-of-the-art advances in synthesizing LLMs and KGs for QA and compare and analyze these approaches in terms of strength, limitations, and KG requirements. We then align the approaches with QA and discuss how these approaches address the main challenges of different complex QA. Finally, we summarize the advancements, evaluation metrics, and benchmark datasets and highlight open challenges and opportunities.