Are Hallucinations Bad Estimations?

📅 2025-09-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses hallucination in generative models—outputs that lack any plausible real-world causal explanation. It argues that hallucination is not due to model incapacity, but rather stems from a structural misalignment between the loss-minimization objective and human-acceptable outputs, manifesting as systematic miscalibration-induced estimation error. Method: The authors formally define hallucination as “an estimate unattributable to any plausible causal source” and derive a high-probability lower bound on the hallucination rate under general data distributions. They empirically validate their theory across three diverse tasks: coin-aggregation, open-domain question answering, and text-to-image generation. Contribution/Results: Their analysis demonstrates that hallucination persists even under optimal loss minimization, establishing its intrinsic inevitability. This work provides a novel theoretical framework for understanding the root causes of hallucination and rigorously characterizes its fundamental limits in generative modeling.
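Read schematically, the summary's central claim can be written as follows. The notation (the plausible-cause set C(x), the rate ε, and the confidence level δ) is our own rendering for illustration, not the paper's exact theorem statement.

```latex
% Schematic only: our notation, not the paper's exact theorem.
% An estimate \hat{y} for input x hallucinates when no plausible cause explains it:
\[
  \mathrm{Hal}(\hat{y} \mid x) \;=\; \mathbf{1}\!\left\{\hat{y} \notin \mathcal{C}(x)\right\},
\]
% and the high-probability lower bound asserts that even the loss-minimizing
% estimator \hat{y}^{\star} hallucinates at a non-negligible rate: with probability
% at least 1 - \delta over the data,
\[
  \mathbb{E}_{x \sim \mathcal{D}}\!\left[\mathrm{Hal}\!\left(\hat{y}^{\star}(x) \mid x\right)\right]
  \;\ge\; \varepsilon(\mathcal{D}, \delta) \;>\; 0 .
\]
```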

📝 Abstract
We formalize hallucinations in generative models as failures to link an estimate to any plausible cause. Under this interpretation, we show that even loss-minimizing optimal estimators still hallucinate. We confirm this with a general high-probability lower bound on the hallucination rate for generic data distributions. This reframes hallucination as a structural misalignment between loss minimization and human-acceptable outputs, and hence as estimation error induced by miscalibration. Experiments on coin aggregation, open-ended QA, and text-to-image generation support our theory.
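The coin-aggregation intuition can be sketched in a few lines. The toy below is a hypothetical simplification of our own (fair coin flips, squared loss), not the authors' actual experimental setup: the loss-minimizing point estimate lands on a value no coin can show, i.e., an estimate unattributable to any plausible causal source.

```python
import numpy as np

# Hypothetical toy, not the paper's exact coin-aggregation task:
# observe fair coin flips in {0, 1} and estimate the next outcome.
rng = np.random.default_rng(0)
outcomes = rng.integers(0, 2, size=10_000)   # observed flips, each 0 or 1

# The estimator minimizing expected squared loss is the sample mean (~0.5),
# which is not a value any single coin flip can take.
optimal_estimate = outcomes.mean()

plausible_values = {0, 1}                     # outputs attributable to a real flip
is_hallucination = optimal_estimate not in plausible_values

print(f"loss-minimizing estimate: {optimal_estimate:.3f}")
print(f"attributable to a plausible outcome? {not is_hallucination}")
```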
Problem

Research questions and friction points this paper is trying to address.

Hallucinations are defined as ungrounded generative model outputs
Optimal estimators still produce hallucinations despite loss minimization
Hallucinations stem from misalignment between loss functions and human expectations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Formalizes hallucinations as estimation-causation failures
Proves optimal estimators still hallucinate under loss minimization
Reframes hallucinations as structural misalignment with human expectations
Authors
Hude Liu
Center for Foundation Models and Generative AI, Northwestern University, Evanston, IL 60208, USA; Department of Computer Science, Northwestern University, Evanston, IL 60208, USA
Jerry Yao-Chieh Hu
Northwestern University
Jennifer Yuntong Zhang
Engineering Science, University of Toronto, Toronto, ON M5S 1A4, CA
Zhao Song
University of California, Berkeley, Berkeley, CA 94720, USA
Han Liu
Department of Statistics and Data Science, Northwestern University, Evanston, IL 60208, USA