🤖 AI Summary
This work addresses the challenge in knowledge graph question answering (KGQA) of simultaneously achieving computational efficiency and effective modeling of structured constraints. While specialized small models are efficient but often overlook implicit constraints, general-purpose large language models (LLMs) offer higher accuracy at substantial inference cost. To bridge this gap, we propose RouterKGQA, a novel framework featuring a dynamic routing mechanism between specialized and general models. The specialized model generates initial reasoning paths, and the LLM is invoked only when necessary to perform knowledge graph–guided correction. Additionally, a constraint-aware answer filtering module refines final predictions. Evaluated across multiple benchmarks, our approach achieves an average improvement of 3.57 points in F1 and 0.49 points in Hits@1 over state-of-the-art methods, while requiring only 1.15 LLM calls per question on average.
Abstract
Knowledge graph question answering (KGQA) is a promising approach for mitigating LLM hallucination by grounding reasoning in structured and verifiable knowledge graphs. Existing approaches fall into two paradigms: retrieval-based methods use small specialized models, which are efficient but often produce unreachable paths and miss implicit constraints, while agent-based methods use large general models, which achieve stronger structural grounding at substantially higher cost. We propose RouterKGQA, a framework for specialized–general model collaboration, in which a specialized model generates reasoning paths and a general model performs KG-guided repair only when needed, improving performance at minimal cost. We further equip the specialized model with constraint-aware answer filtering, which reduces redundant answers. In addition, we design a more efficient general-agent workflow, further lowering inference cost. Experimental results show that RouterKGQA outperforms the previous best by 3.57 points in F1 and 0.49 points in Hits@1 on average across benchmarks, while requiring only 1.15 LLM calls per question on average. Code and models are available at https://github.com/Oldcircle/RouterKGQA.
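To make the collaboration pattern concrete, the toy sketch below illustrates the general control flow: a cheap specialized step proposes relation paths, and a (stubbed) LLM repair step is invoked only when a path cannot be executed on the KG. The toy KG, the stub functions, and the unreachability trigger are all illustrative assumptions, not the paper's actual interfaces or routing criterion.

```python
# Toy sketch of a specialized/general routing loop (names, the toy KG,
# and the routing trigger are assumptions for illustration only).

# Toy KG: (entity, relation) -> set of tail entities.
KG = {
    ("Paris", "capital_of"): {"France"},
    ("France", "continent"): {"Europe"},
}

def path_reachable(start, relations):
    """Try to execute a relation path on the toy KG.

    Returns (ok, answers): ok is False if some hop has no edges.
    """
    frontier = {start}
    for rel in relations:
        frontier = set().union(*(KG.get((e, rel), set()) for e in frontier))
        if not frontier:
            return False, set()
    return True, frontier

def specialized_paths(question):
    # Stand-in for the small specialized model: candidate relation paths.
    # The second path is deliberately broken to trigger routing.
    return [("Paris", ["capital_of", "continent"]),
            ("Paris", ["capital_of", "borders"])]

def llm_repair(start, relations):
    # Stand-in for KG-guided LLM repair: here we simply drop the hop
    # that cannot be executed (a placeholder for the real repair step).
    return start, relations[:-1]

def answer(question):
    """Run the routing loop; count how often the 'LLM' was called."""
    llm_calls = 0
    answers = set()
    for start, rels in specialized_paths(question):
        ok, found = path_reachable(start, rels)
        if not ok:  # route to the general model only when needed
            llm_calls += 1
            start, rels = llm_repair(start, rels)
            ok, found = path_reachable(start, rels)
        if ok:
            answers |= found
    return answers, llm_calls
```

In this toy run, only the broken path triggers a repair call, so the "LLM" is consulted once for two candidate paths, mirroring the low average call count the abstract reports; the real system's router and repair prompt are, of course, far more involved.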