🤖 AI Summary
This work examines the undesirable behaviors that large language models (LLMs) exhibit under epistemic uncertainty, namely hallucination, degenerate text, and sequence repetition, and frames them as a unified family of fallback behaviors rather than independent failures. Through controlled experiments within model families, uncertainty quantification, and a comparison of decoding strategies, the authors establish a consistent ordering of fallback behaviors across pretraining budget, parameter count, and instruction tuning: as models become more advanced, their dominant fallback shifts from sequence repetition, to degenerate text, to hallucination. The same hierarchy appears within a single generated sequence, even for the best models: as uncertainty accumulates during generation, models shift from hallucination, to degenerate text, and finally to sequence repetition. The work further shows that decoding techniques that suppress repetition, such as random sampling, can increase harder-to-detect hallucinations, suggesting a new evaluation dimension and intervention point for trustworthy LLM generation.
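The summary mentions uncertainty quantification; below is a minimal sketch of one common proxy, per-token predictive entropy of the next-token distribution. The model name (`gpt2`) and the entropy-based measure are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: per-token predictive entropy as a rough uncertainty proxy.
# Assumptions: "gpt2" is a placeholder model; entropy is one possible
# uncertainty measure, not necessarily the one used in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def token_entropies(text: str) -> list[float]:
    """Entropy (in nats) of the model's next-token distribution at each position."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # (1, seq_len)
    return entropy[0].tolist()

print(token_entropies("The capital of France is"))
```

Tracking how this quantity evolves over a generation is one simple way to relate rising uncertainty to the onset of the fallback behaviors described above.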
📝 Abstract
Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We propose to view these behaviors as fallbacks that models exhibit under epistemic uncertainty, and investigate the connection between them. We categorize fallback behaviors (sequence repetitions, degenerate text, and hallucinations) and extensively analyze them in models from the same family that differ by the amount of pretraining tokens, parameter count, or the inclusion of instruction-following training. Our experiments reveal a clear and consistent ordering of fallback behaviors across all these axes: the more advanced an LLM is (i.e., one trained on more tokens, with more parameters, or instruction-tuned), the more its fallback behavior shifts from sequence repetitions, to degenerate text, and then to hallucinations. Moreover, the same ordering is observed during the generation of a single sequence, even for the best-performing models; as uncertainty increases, models shift from generating hallucinations to producing degenerate text and finally sequence repetitions. Lastly, we demonstrate that while common decoding techniques, such as random sampling, alleviate unwanted behaviors like sequence repetitions, they increase harder-to-detect hallucinations. A concrete sketch of this decoding comparison follows.
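To make the decoding comparison concrete, here is a hedged sketch contrasting greedy decoding with random (temperature) sampling and flagging verbatim repetitions. The model choice and the `detect_repetition` helper are hypothetical illustrations; the paper's taxonomy and detection of fallback behaviors are more involved.

```python
# Sketch: greedy decoding vs. random sampling, with a crude repetition check.
# Assumptions: "gpt2" is a placeholder model, and detect_repetition is a
# hypothetical helper, not the paper's actual classification procedure.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def generate(prompt: str, sample: bool) -> str:
    """Generate a continuation, either greedily or via random sampling."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(
        ids,
        max_new_tokens=60,
        do_sample=sample,  # False = greedy decoding; True = random sampling
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(out[0], skip_special_tokens=True)

def detect_repetition(text: str, n: int = 4) -> bool:
    """Crude proxy for sequence repetition: any word n-gram occurring twice."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    return len(ngrams) != len(set(ngrams))

for sample in (False, True):
    text = generate("In a shocking finding, scientists discovered", sample=sample)
    print(f"sampling={sample}, repetition={detect_repetition(text)}")
```

Note the asymmetry this toy setup cannot capture: repetition is easy to flag mechanically, whereas the hallucinations that sampling may introduce require factuality checks against external knowledge.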