🤖 AI Summary
This paper addresses hallucination in generative models, i.e., outputs that lack any plausible real-world causal explanation. It argues that hallucination is not due to model incapacity but stems from a structural misalignment between the loss-minimization objective and human-acceptable outputs, which manifests as systematic, miscalibration-induced estimation error. Method: The authors formally define hallucination as “an estimate unattributable to any plausible causal source” and derive a high-probability lower bound on the hallucination rate under general data distributions. They empirically validate their theory across three diverse tasks: coin aggregation, open-ended question answering, and text-to-image generation. Contribution/Results: Their analysis demonstrates that hallucination persists even under optimal loss minimization, establishing that it is intrinsic and inevitable rather than a defect of particular models. This work provides a novel theoretical framework for understanding the root causes of hallucination and rigorously characterizes its fundamental limits in generative modeling.
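To see why optimal loss minimization alone cannot remove hallucination, one standard illustration (our sketch for the squared-loss case; the paper's bound is stated for general data distributions) is that the Bayes-optimal estimator is a conditional average over all plausible causes:

```latex
\[
\hat{y}^{\star}(x)
  = \arg\min_{\hat{y}} \, \mathbb{E}\!\left[(Y-\hat{y})^{2} \mid X=x\right]
  = \mathbb{E}\!\left[Y \mid X=x\right].
\]
```

When $Y \mid X = x$ is multimodal, e.g., with two plausible answers $y_1$ and $y_2$, this optimum falls between the modes, possibly in a region of near-zero density, and is then attributable to no plausible cause.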
📝 Abstract
We formalize hallucinations in generative models as failures to link an estimate to any plausible cause. Under this interpretation, we show that even loss-minimizing optimal estimators still hallucinate, and we confirm this with a general high-probability lower bound on the hallucination rate for generic data distributions. This reframes hallucination as a structural misalignment between loss minimization and human-acceptable outputs, and hence as estimation error induced by miscalibration. Experiments on coin aggregation, open-ended QA, and text-to-image generation support our theory.
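As a concrete illustration of the coin-aggregation intuition, here is a minimal sketch (our toy construction; the biases, prior, and flip counts are assumptions, not the authors' experimental setup):

```python
import numpy as np

# Two plausible causes: a coin with bias 0.1 or one with bias 0.9,
# each equally likely a priori. (Illustrative numbers, not the paper's.)
biases = np.array([0.1, 0.9])
prior = np.array([0.5, 0.5])

n_flips, heads = 4, 2  # an ambiguous observation: 2 heads in 4 flips

# Posterior over the two coins under a binomial likelihood.
likelihood = biases**heads * (1.0 - biases) ** (n_flips - heads)
posterior = prior * likelihood
posterior /= posterior.sum()

# Under squared loss, the Bayes-optimal estimate of the bias is the
# posterior mean ...
estimate = posterior @ biases

# ... which by symmetry is 0.5: a bias attributable to NEITHER plausible
# coin, i.e., a "hallucination" in the paper's sense, produced by the
# loss-minimizing estimator itself.
print(f"posterior={posterior.round(3)}, optimal estimate={estimate:.3f}")
# -> posterior=[0.5 0.5], optimal estimate=0.500
```

The point of the toy example is that the estimator is doing exactly what the objective asks of it; the implausible output is a property of loss minimization itself, not of an undertrained model.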