Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the insufficient reliability of knowledge probing methods for large language models (LLMs), systematically exposing a “dual inconsistency”: intra-method inconsistency (e.g., merely reordering answer options can drop agreement to around 40%) and cross-method inconsistency (decision agreement across methods as low as 7%). To address this, the authors propose a rigorous evaluation framework based on input perturbations and quantitative consistency metrics, covering mainstream techniques including calibration-based and prompting-based probing. The paper provides the first empirical quantification and formal definition of this dual inconsistency, demonstrating that existing probes are highly fragile to semantically irrelevant input perturbations and frequently yield contradictory judgments about whether an LLM possesses the same factual knowledge. These findings challenge the foundational assumptions of current knowledge probing paradigms and supply both theoretical grounding and empirical evidence for developing robust, perturbation-invariant probing frameworks.
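
The evaluation essentially reduces to pairwise agreement between binary "knows / doesn't know" decisions. The following Python sketch is not the authors' released code; the function names and the binary-decision representation are assumptions, but it illustrates how intra-method and cross-method consistency could be computed:

```python
from itertools import combinations
from typing import Dict, List


def agreement(a: List[bool], b: List[bool]) -> float:
    """Fraction of questions on which two decision vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def intra_method_consistency(runs: List[List[bool]]) -> float:
    """Mean pairwise agreement of one probing method across input
    perturbations (e.g., reordered answer options, paraphrased templates)."""
    pairs = list(combinations(runs, 2))
    return sum(agreement(a, b) for a, b in pairs) / len(pairs)


def cross_method_consistency(methods: Dict[str, List[bool]]) -> float:
    """Mean pairwise agreement between different probing methods
    (e.g., calibration-based vs. prompting-based) on identical inputs."""
    pairs = list(combinations(methods.values(), 2))
    return sum(agreement(a, b) for a, b in pairs) / len(pairs)
```

Under this reading, the reported numbers are averages of such pairwise agreements: around 40% within a single method under option shuffling, and as low as 7% between different methods on identical inputs.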

📝 Abstract
The reliability of large language models (LLMs) is greatly compromised by their tendency to hallucinate, underscoring the need for precise identification of knowledge gaps within LLMs. Various methods for probing such gaps exist, ranging from calibration-based to prompting-based methods. To evaluate these probing methods, in this paper, we propose a new process based on using input variations and quantitative metrics. Through this, we expose two dimensions of inconsistency in knowledge gap probing. (1) Intra-method inconsistency: Minimal non-semantic perturbations in prompts lead to considerable variance in detected knowledge gaps within the same probing method; e.g., the simple variation of shuffling answer options can decrease agreement to around 40%. (2) Cross-method inconsistency: Probing methods contradict each other on whether a model knows the answer. Methods are highly inconsistent -- with decision consistency across methods being as low as 7% -- even though the model, dataset, and prompt are all the same. These findings challenge existing probing methods and highlight the urgent need for perturbation-robust probing frameworks.
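
As a concrete example of the non-semantic perturbation named in the abstract, answer options of a multiple-choice question can be reordered while the question itself is unchanged. A minimal sketch, assuming a simple "A./B./C." prompt template that the paper does not actually specify:

```python
import random
from typing import List, Tuple


def shuffle_options(question: str, options: List[str], answer_idx: int,
                    seed: int) -> Tuple[str, int]:
    """Return a semantically equivalent prompt with reordered answer options,
    plus the new index of the gold answer (template is hypothetical)."""
    rng = random.Random(seed)
    order = list(range(len(options)))
    rng.shuffle(order)
    labels = "ABCDEFGH"
    lines = [question] + [f"{labels[i]}. {options[j]}" for i, j in enumerate(order)]
    return "\n".join(lines), order.index(answer_idx)


# The same question rendered with two different option orders; a reliable
# probe should reach the same "knows / doesn't know" verdict on both.
prompt_a, gold_a = shuffle_options("What is the capital of France?",
                                   ["Berlin", "Paris", "Rome", "Madrid"], 1, seed=0)
prompt_b, gold_b = shuffle_options("What is the capital of France?",
                                   ["Berlin", "Paris", "Rome", "Madrid"], 1, seed=1)
```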
Problem

Research questions and friction points this paper is trying to address.

Identify knowledge gaps in LLMs reliably
Assess inconsistency in probing methods for LLMs
Develop robust frameworks for knowledge gap detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Input variations for knowledge gap probing
Quantitative metrics to evaluate probing methods
Demonstrated need for perturbation-robust probing frameworks