Parametric Knowledge is Not All You Need: Toward Honest Large Language Models via Retrieval of Pretraining Data

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models often generate hallucinations when lacking relevant knowledge, rather than honestly responding with “I don’t know.” This work presents the first honesty evaluation benchmark that leverages publicly available pretraining corpora—specifically, the Pythia model and its complete training data—to delineate a model’s true knowledge boundaries. The authors propose a retrieval-augmented approach that integrates pretraining data retrieval with knowledge boundary identification. This method significantly enhances the model’s ability to respond honestly to unknown questions, reducing hallucinations while improving the accuracy of “I don’t know” responses. The study offers a novel pathway toward building more reliable and trustworthy language models.

📝 Abstract
Large language models (LLMs) are highly capable of answering questions, but they are often unaware of their own knowledge boundary, i.e., knowing what they know and what they don't know. As a result, they can generate factually incorrect responses on topics they do not have enough knowledge of, commonly known as hallucination. Rather than hallucinating, a language model should be more honest and respond with "I don't know" when it does not have enough knowledge about a topic. Many methods have been proposed to improve LLM honesty, but their evaluations lack robustness, as they do not take into account the knowledge that the LLM has ingested during its pretraining. In this paper, we propose a more robust evaluation benchmark dataset for LLM honesty by utilizing Pythia, a truly open LLM with publicly available pretraining data. In addition, we also propose a novel method for harnessing the pretraining data to build a more honest LLM.
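The core idea in the abstract, retrieving from the pretraining corpus to decide whether the model has enough knowledge to answer or should abstain, can be sketched as follows. This is a minimal illustration, not the paper's actual method: the toy corpus, the token-overlap scoring, and the threshold are all assumptions made for demonstration; a real system would use the full Pythia pretraining data and a proper retriever.

```python
# Hedged sketch: abstain when pretraining-data retrieval finds weak support.
# Scoring via simple token overlap; corpus and threshold are illustrative only.

def token_overlap(query: str, passage: str) -> float:
    """Fraction of query tokens that also appear in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0

def answer_or_abstain(question: str, corpus: list[str], threshold: float = 0.5) -> str:
    """Answer only if some pretraining passage supports the question well enough."""
    best = max((token_overlap(question, p) for p in corpus), default=0.0)
    # Weak retrieval support -> the question likely lies outside the
    # model's knowledge boundary, so respond honestly.
    return "answer" if best >= threshold else "I don't know"

# Toy stand-in for a pretraining corpus.
corpus = [
    "Pythia is a suite of open language models with public pretraining data",
    "The Pile is the pretraining corpus used to train Pythia",
]

print(answer_or_abstain("what data was used to train Pythia", corpus))  # answer
print(answer_or_abstain("who won the 2034 world cup", corpus))          # I don't know
```

The design point mirrors the abstract: because Pythia's pretraining data is public, retrieval over it gives a verifiable signal for where the model's knowledge boundary lies, rather than relying on the model's own self-assessment.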
Problem

Research questions and friction points this paper is trying to address.

hallucination
honesty
large language models
knowledge boundary
pretraining data
Innovation

Methods, ideas, or system contributions that make the work stand out.

honest language models
pretraining data retrieval
hallucination mitigation
knowledge boundary awareness
Pythia benchmark
Christopher Adrian Kusuma
Department of Computer Science, National University of Singapore
Muhammad Reza Qorib
Department of Computer Science, National University of Singapore
Hwee Tou Ng
Provost's Chair Professor of Computer Science, National University of Singapore
Natural Language Processing · Computational Linguistics