🤖 AI Summary
This survey addresses a critical barrier to deploying large language models (LLMs) in high-stakes settings: insufficient reliability. Its central question is how uncertainty can be used to improve controllability and robustness. The paper traces a shift in which uncertainty moves from a passive diagnostic metric to an active control signal that drives reasoning, agent decision-making, and reinforcement learning. Grounded in theoretical frameworks such as Bayesian methods and conformal prediction, this perspective covers self-correction in advanced reasoning, decisions about tool use and information seeking in autonomous agents, and intrinsic rewards in reinforcement learning. The survey offers a comprehensive overview, critical analysis, and practical design patterns for building scalable, reliable, and trustworthy AI systems.
📝 Abstract
While Large Language Models (LLMs) show remarkable capabilities, their unreliability remains a critical barrier to deployment in high-stakes domains. This survey charts a functional evolution in addressing this challenge: uncertainty has shifted from a passive diagnostic metric to an active control signal guiding real-time model behavior. We show how uncertainty is used as a control signal across three frontiers: in advanced reasoning, to optimize computation and trigger self-correction; in autonomous agents, to govern metacognitive decisions about tool use and information seeking; and in reinforcement learning, to mitigate reward hacking and enable self-improvement via intrinsic rewards. By grounding these advances in emerging theoretical frameworks such as Bayesian methods and Conformal Prediction, we provide a unified perspective on this trend. The survey offers a comprehensive overview, critical analysis, and practical design patterns, and argues that mastering uncertainty as a control signal is essential for building the next generation of scalable, reliable, and trustworthy AI.
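As a rough illustration of what "uncertainty as an active control signal" can mean for an agent, the sketch below gates a decision on the predictive entropy of the model's answer distribution. This is not a method taken from the survey: the function names, the action labels, and the threshold of 0.5 nats are all hypothetical choices made for the example, and entropy is only one of many possible uncertainty estimates.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def choose_action(probs, threshold=0.5):
    """Hypothetical gating rule: answer directly when entropy is at or
    below the threshold, otherwise seek more information (for example,
    by calling a tool). The threshold value is an assumption."""
    if predictive_entropy(probs) <= threshold:
        return "answer"
    return "seek_information"


# Illustrative distributions over four candidate answers (made-up numbers).
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]

print(choose_action(confident))
print(choose_action(uncertain))
```

In this sketch the uncertainty estimate does more than get reported after the fact: it decides which branch the system takes. The passive alternative would compute the same entropy and only log it.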