UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions

📅 2025-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing unanswerable question (UAQ) evaluation datasets lack factual knowledge grounding, which hinders rigorous assessment of how well large language models (LLMs) use factual knowledge in UAQ scenarios. Method: We introduce UAQFact, the first bilingual UAQ benchmark augmented with auxiliary factual knowledge, automatically constructed from knowledge graphs and covering both implicit and explicit reasoning paths. We propose a dual-task evaluation paradigm that separately measures a model's ability to retrieve internal parametric knowledge and to integrate externally injected factual knowledge. Results: Experiments reveal substantial performance degradation across mainstream LLMs (e.g., Llama, Qwen, GLM) on UAQFact; external knowledge injection yields only marginal gains; and prevalent failure modes, including factual neglect and spurious associations, indicate fundamental deficiencies in how LLMs utilize factual knowledge.

📝 Abstract
Handling unanswerable questions (UAQ) is crucial for LLMs, as it helps prevent misleading responses in complex situations. While previous studies have built several datasets to assess LLMs' performance on UAQ, these datasets lack factual knowledge support, which limits the evaluation of LLMs' ability to utilize their factual knowledge when handling UAQ. To address this limitation, we introduce UAQFact, a new bilingual unanswerable question dataset with auxiliary factual knowledge created from a Knowledge Graph. Based on UAQFact, we further define two new tasks to measure LLMs' ability to utilize internal and external factual knowledge, respectively. Our experimental results across multiple LLM series show that UAQFact presents significant challenges, as LLMs do not consistently perform well even when they have the relevant factual knowledge stored. Additionally, we find that incorporating external knowledge may enhance performance, but LLMs still cannot make full use of that knowledge, which may result in incorrect responses.

Problem

Research questions and friction points this paper is trying to address.

Evaluating whether LLMs can use factual knowledge to handle unanswerable questions
Addressing the lack of factual knowledge support in existing UAQ datasets
Assessing how LLMs utilize internal and external knowledge when handling UAQ

Innovation

Methods, ideas, or system contributions that make the work stand out.

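Introduces UAQFact, a bilingual UAQ dataset with auxiliary factual knowledge built from a Knowledge Graph
Defines two tasks that measure how LLMs utilize internal and external factual knowledge
Evaluates multiple LLM series and identifies challenges in their factual knowledge utilization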
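
The two evaluation settings summarized above (question only versus question plus auxiliary facts) can be pictured with a small sketch. This is not the authors' code or prompt format: the `UAQItem` fields, the prompt wording, the keyword-based refusal check, the scoring rule, and the stub model are all hypothetical choices made for illustration, and the two example items are invented rather than taken from UAQFact.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class UAQItem:
    """Hypothetical record layout; the real UAQFact schema may differ."""
    question: str
    facts: List[str]   # auxiliary facts, e.g. verbalized knowledge-graph triples
    answerable: bool   # False marks an unanswerable question


# Assumed refusal phrases; a real evaluation would need a more robust check.
REFUSAL_MARKERS = ("cannot be answered", "unanswerable", "no answer", "i don't know")


def build_prompt(item: UAQItem, with_facts: bool) -> str:
    """Build a question-only prompt, or one that prepends the auxiliary facts."""
    lines = []
    if with_facts and item.facts:
        lines.append("Facts:")
        lines.extend(f"- {fact}" for fact in item.facts)
    lines.append(f"Question: {item.question}")
    lines.append("If the question cannot be answered, say so.")
    return "\n".join(lines)


def is_refusal(response: str) -> bool:
    """Return True if the response contains any assumed refusal phrase."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_accuracy(items: List[UAQItem],
                     model: Callable[[str], str],
                     with_facts: bool) -> float:
    """Fraction of items where the model refuses exactly when the item is unanswerable."""
    if not items:
        return 0.0
    correct = 0
    for item in items:
        response = model(build_prompt(item, with_facts))
        if is_refusal(response) == (not item.answerable):
            correct += 1
    return correct / len(items)


# Invented example items, not drawn from the dataset.
items = [
    UAQItem("Who was the spouse of Isaac Newton?",
            ["Isaac Newton never married."], answerable=False),
    UAQItem("What is the capital of France?",
            ["Paris is the capital of France."], answerable=True),
]


def stub_model(prompt: str) -> str:
    """Toy stand-in for an LLM: refuses only when the decisive fact is in the prompt."""
    if "never married" in prompt:
        return "This question cannot be answered."
    return "Paris"


internal_score = refusal_accuracy(items, stub_model, with_facts=False)
external_score = refusal_accuracy(items, stub_model, with_facts=True)
print(f"question only: {internal_score:.2f}, with facts: {external_score:.2f}")
```

Replacing `stub_model` with a call to a real LLM and comparing the two scores gives a rough sense of how much the injected facts help, which is the kind of gap the paper reports on.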
Chuanyuan Tan
School of Computer Science and Technology, Soochow University, China
Wenbiao Shao
School of Computer Science and Technology, Soochow University, China
Hao Xiong
School of Computer Science and Technology, Soochow University, China
Tong Zhu
School of Computer Science and Technology, Soochow University, China
Zhenhua Liu
School of Computer Science and Technology, Soochow University, China
Kai Shi
Microsoft
Wenliang Chen
School of Computer Science and Technology, Soochow University, China