🤖 AI Summary
Large language models (LLMs) commonly rely on manually tuned hyperparameters for token sampling, which complicates deployment and generalizes poorly across tasks. To address this, we propose Entropy Equilibrium Sampling (EES), the first dynamic sampling mechanism that operates **without auxiliary hyperparameters**. Grounded in information theory, EES jointly models normalized entropy and probability mass to adaptively construct candidate token sets, simultaneously improving accuracy, coherence, and diversity across temperature settings. EES is architecture-agnostic and integrates seamlessly with mainstream LLMs. Empirical evaluation on diverse reasoning and text generation benchmarks demonstrates consistent and significant improvements over standard sampling methods, including Top-k and Nucleus sampling, across multiple models and temperature configurations. Crucially, EES maintains robust performance without task-specific tuning, substantially simplifying deployment and enhancing generation reliability.
📝 Abstract
Token sampling strategies critically influence text generation quality in large language models (LLMs). However, existing methods introduce additional hyperparameters that require extensive tuning and complicate deployment. We present Entropy Equilibrium Sampling (EES), an approach free of auxiliary hyperparameters, inspired by information theory, that dynamically adjusts the candidate token set by balancing normalized entropy against probability mass. We evaluate EES on both reasoning and generation tasks across a range of model architectures. Our results show that EES performs consistently well across temperature settings, delivering competitive accuracy and coherence while maintaining diversity. By eliminating hyperparameter tuning, EES greatly simplifies deployment while improving performance. Code is available at https://github.com/shuanncai/EES.
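To make the core idea concrete, here is a minimal sketch of one plausible reading of an entropy/mass balance rule. This is an illustrative assumption, not the authors' exact algorithm (see the linked repository for the real implementation): keep the smallest set of top-probability tokens whose cumulative mass reaches the distribution's normalized entropy, so peaked (low-entropy) distributions yield small candidate sets and flat (high-entropy) distributions yield large ones, with no extra hyperparameters.

```python
import math

def ees_candidate_set(probs):
    """Hypothetical EES-style rule (assumed, not the paper's exact method):
    select the smallest top-probability set whose cumulative mass is at
    least the normalized entropy of the distribution."""
    V = len(probs)
    # Shannon entropy in nats, normalized by log(V) so it lies in [0, 1].
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    h_norm = entropy / math.log(V) if V > 1 else 0.0
    # Consider tokens in order of decreasing probability.
    order = sorted(range(V), key=lambda i: probs[i], reverse=True)
    chosen, mass = [], 0.0
    for i in order:
        chosen.append(i)
        mass += probs[i]
        if mass >= h_norm:  # equilibrium: mass balances normalized entropy
            break
    return chosen

# Peaked distribution -> low normalized entropy -> tiny candidate set.
print(ees_candidate_set([0.9, 0.05, 0.03, 0.02]))   # -> [0]
# Near-uniform distribution -> high normalized entropy -> large set.
print(ees_candidate_set([0.3, 0.27, 0.23, 0.2]))    # -> [0, 1, 2, 3]
```

The appeal of such a rule is that both quantities are computed from the model's own next-token distribution, so the cutoff adapts per step without a user-chosen `k` or `p`.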