🤖 AI Summary
Existing text-generation sampling methods are highly sensitive to the temperature parameter, making it difficult to strike a stable balance between diversity and accuracy. This work proposes Min-$k$ sampling, which identifies the "semantic cliff" (the boundary between high-confidence core tokens and uncertain long-tail tokens) by analyzing the local shape of the sorted logit distribution. At each decoding step, the method adaptively sets the truncation threshold using a position-weighted relative decay rate, achieving strict temperature invariance without relying on global statistics while capturing the fine-grained confidence structure among candidate tokens. Experiments show that Min-$k$ significantly outperforms mainstream approaches such as Top-$k$, Top-$p$, and Min-$p$ on reasoning tasks, creative writing, and human evaluations, and remains robust even under extreme temperature settings.
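The summary above can be sketched in code. The exact position-weighted relative decay rate is not specified here, so the formula below (gap between adjacent sorted logits over the total logit range, damped by $1/i^\alpha$) is an illustrative stand-in, not the paper's definition; the names `min_k_keep` and `min_k_sample` are likewise hypothetical.

```python
import numpy as np

def min_k_keep(logits, alpha=0.5, eps=1e-9):
    """Indices kept before the 'semantic cliff' in the sorted logits.

    NOTE: this position-weighted relative decay rate is an assumed,
    illustrative formula, not the one from the paper.
    """
    logits = np.asarray(logits, dtype=float)
    order = np.argsort(logits)[::-1]          # sort token ids by logit, descending
    s = logits[order]
    gaps = s[:-1] - s[1:]                     # drop between adjacent sorted logits
    total = s[0] - s[-1] + eps                # overall logit range
    weights = 1.0 / np.arange(1, len(gaps) + 1) ** alpha  # favour early cliffs
    decay = weights * gaps / total            # ratio of differences: scale-free
    cliff = int(np.argmax(decay)) + 1         # cut at the sharpest weighted drop
    return order[:cliff]

def min_k_sample(logits, rng, temperature=1.0, **kw):
    """Truncate in logit space (temperature-independent), then sample."""
    kept = min_k_keep(logits, **kw)           # truncation ignores temperature
    z = np.asarray(logits, dtype=float)[kept] / temperature
    p = np.exp(z - z.max())                   # stable softmax over survivors
    p /= p.sum()
    return int(rng.choice(kept, p=p))
```

Because the decay rate is a ratio of logit differences, dividing all logits by a temperature $T$ rescales numerator and denominator identically, so the kept set never changes with $T$; only the renormalized softmax over the survivors does.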
📝 Abstract
The quality of text generated by large language models depends critically on the decoding sampling strategy. While mainstream methods such as Top-$k$, Top-$p$, and Min-$p$ achieve a balance between diversity and accuracy through probability-space truncation, they share an inherent limitation: extreme sensitivity to the temperature parameter. Recent logit-space approaches like Top-$n\sigma$ achieve temperature invariance but rely on global statistics that are susceptible to long-tail noise, failing to capture fine-grained confidence structures among top candidates. We propose **Min-$k$ Sampling**, a novel dynamic truncation strategy that analyzes the local shape of the sorted logit distribution to identify "semantic cliffs": sharp transitions from high-confidence core tokens to uncertain long-tail tokens. By computing a position-weighted relative decay rate, Min-$k$ dynamically determines truncation boundaries at each generation step. We formally prove that Min-$k$ achieves strict temperature invariance and empirically demonstrate its low sensitivity to hyperparameter choices. Experiments on multiple reasoning benchmarks, creative writing tasks, and human evaluation show that Min-$k$ consistently improves text quality, maintaining robust performance even under extreme temperature settings where probability-based methods collapse. We make our code, models, and analysis tools publicly available.
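The temperature-invariance claim follows from the shape of the statistic: any decay rate built as a ratio of logit differences is unchanged when every logit is divided by $T$. A sketch, using an illustrative decay rate $d_i$ (the paper's exact weighting may differ), where $\ell_{(i)}$ is the $i$-th largest logit, $w_i$ a position weight, and $V$ the vocabulary size:

$$
d_i(\ell) \;=\; w_i\,\frac{\ell_{(i)}-\ell_{(i+1)}}{\ell_{(1)}-\ell_{(V)}},
\qquad
d_i\!\left(\tfrac{\ell}{T}\right) \;=\; w_i\,\frac{\left(\ell_{(i)}-\ell_{(i+1)}\right)/T}{\left(\ell_{(1)}-\ell_{(V)}\right)/T} \;=\; d_i(\ell),
$$

so the $\arg\max$ over $i$, and hence the truncation boundary, is identical at every temperature; probability-space methods lack this property because softmax probabilities change nonlinearly with $T$.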