🤖 AI Summary
This work examines the undesirable behaviors that large language models (LLMs) exhibit under epistemic uncertainty, namely hallucination, degenerate text, and sequence repetition, and frames them as a unified family of fallback behaviors rather than independent failures. Through controlled experiments within model families, uncertainty quantification, and a comparison of decoding strategies, the authors establish a consistent ordering of fallback behaviors across pretraining budget, parameter count, and instruction tuning: as models become more advanced, their dominant fallback shifts from sequence repetition, to degenerate text, to hallucination. The same hierarchy appears within a single generated sequence, even for the best models: as uncertainty accumulates during generation, models shift from hallucination, to degenerate text, and finally to sequence repetition. The work further shows that decoding techniques that suppress repetition, such as random sampling, can increase harder-to-detect hallucinations, suggesting a new evaluation dimension and intervention point for trustworthy LLM generation.
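The summary mentions uncertainty quantification; below is a minimal sketch of one common proxy, per-token predictive entropy of the next-token distribution. The model name (`gpt2`) and the entropy-based measure are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: per-token predictive entropy as a rough uncertainty proxy.
# Assumptions: "gpt2" is a placeholder model; entropy is one possible
# uncertainty measure, not necessarily the one used in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def token_entropies(text: str) -> list[float]:
    """Entropy (in nats) of the model's next-token distribution at each position."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # (1, seq_len)
    return entropy[0].tolist()

print(token_entropies("The capital of France is"))
```

Tracking how this quantity evolves over a generation is one simple way to relate rising uncertainty to the onset of the fallback behaviors described above.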
📝 Abstract
Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We propose to view these behaviors as fallbacks that models exhibit under epistemic uncertainty, and investigate the connection between them. We categorize fallback behaviors (sequence repetitions, degenerate text, and hallucinations) and extensively analyze them in models from the same family that differ by the amount of pretraining tokens, parameter count, or the inclusion of instruction-following training. Our experiments reveal a clear and consistent ordering of fallback behaviors across all these axes: the more advanced an LLM is (i.e., one trained on more tokens, with more parameters, or instruction-tuned), the more its fallback behavior shifts from sequence repetitions, to degenerate text, and then to hallucinations. Moreover, the same ordering is observed during the generation of a single sequence, even for the best-performing models; as uncertainty increases, models shift from generating hallucinations to producing degenerate text and finally sequence repetitions. Lastly, we demonstrate that while common decoding techniques, such as random sampling, alleviate unwanted behaviors like sequence repetitions, they increase harder-to-detect hallucinations. A concrete sketch of this decoding comparison follows.
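To make the decoding comparison concrete, here is a hedged sketch contrasting greedy decoding with random (temperature) sampling and flagging verbatim repetitions. The model choice and the `detect_repetition` helper are hypothetical illustrations; the paper's taxonomy and detection of fallback behaviors are more involved.

```python
# Sketch: greedy decoding vs. random sampling, with a crude repetition check.
# Assumptions: "gpt2" is a placeholder model, and detect_repetition is a
# hypothetical helper, not the paper's actual classification procedure.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def generate(prompt: str, sample: bool) -> str:
    """Generate a continuation, either greedily or via random sampling."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(
        ids,
        max_new_tokens=60,
        do_sample=sample,  # False = greedy decoding; True = random sampling
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(out[0], skip_special_tokens=True)

def detect_repetition(text: str, n: int = 4) -> bool:
    """Crude proxy for sequence repetition: any word n-gram occurring twice."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    return len(ngrams) != len(set(ngrams))

for sample in (False, True):
    text = generate("In a shocking finding, scientists discovered", sample=sample)
    print(f"sampling={sample}, repetition={detect_repetition(text)}")
```

Note the asymmetry this toy setup cannot capture: repetition is easy to flag mechanically, whereas the hallucinations that sampling may introduce require factuality checks against external knowledge.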