Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics

📅 2026-04-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text generation sampling methods are highly sensitive to the temperature parameter, making it difficult to achieve a stable balance between diversity and accuracy. This work proposes Min-$k$ sampling, which dynamically identifies the “semantic cliff”—the boundary between high-confidence core tokens and uncertain long-tail tokens—by analyzing the local shape of the sorted logit distribution. The method adaptively determines the truncation threshold at each decoding step using a position-weighted relative decay rate, thereby achieving strict temperature invariance without relying on global statistics. Min-$k$ precisely captures the fine-grained confidence structure among candidate tokens. Experimental results demonstrate that Min-$k$ significantly outperforms mainstream approaches such as Top-$k$, Top-$p$, and Min-$p$ across reasoning tasks, creative writing, and human evaluations, maintaining robust performance even under extreme temperature settings.
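The truncation step described above can be sketched in a few lines. This is a hypothetical reconstruction, not the paper's exact formulation: the function name `min_k_truncate`, the power-law position weighting controlled by `alpha`, and the normalization by the mean gap are all assumptions made for illustration. The key property carries over regardless of the exact weighting: dividing all logits by a temperature scales every gap and the mean gap by the same factor, so the scores, and hence the kept token set, do not change.

```python
import numpy as np

def min_k_truncate(logits: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Sketch of cliff-based truncation (assumed form): keep the tokens
    before the largest position-weighted relative logit drop."""
    order = np.argsort(logits)[::-1]              # token ids, highest logit first
    sorted_logits = logits[order]
    gaps = sorted_logits[:-1] - sorted_logits[1:]  # successive logit drops
    positions = np.arange(1, len(gaps) + 1)
    weights = positions ** (-alpha)               # discount cliffs deep in the tail (assumed)
    # Normalize each gap by the mean gap; ratios of gaps are unchanged
    # when logits are divided by a temperature, giving invariance.
    mean_gap = max(float(gaps.mean()), 1e-9)
    scores = weights * gaps / mean_gap
    cliff = int(np.argmax(scores))                # position of the sharpest weighted drop
    return order[:cliff + 1]                      # token ids kept for sampling
```

On a toy distribution with three strong candidates followed by a sharp drop, the sketch keeps exactly those three tokens, and keeps the same three after any temperature rescaling of the logits.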

📝 Abstract
The quality of text generated by large language models depends critically on the decoding sampling strategy. While mainstream methods such as Top-$k$, Top-$p$, and Min-$p$ achieve a balance between diversity and accuracy through probability-space truncation, they share an inherent limitation: extreme sensitivity to the temperature parameter. Recent logit-space approaches like Top-$nσ$ achieve temperature invariance but rely on global statistics that are susceptible to long-tail noise, failing to capture fine-grained confidence structures among top candidates. We propose Min-$k$ Sampling, a novel dynamic truncation strategy that analyzes the local shape of the sorted logit distribution to identify "semantic cliffs": sharp transitions from high-confidence core tokens to uncertain long-tail tokens. By computing a position-weighted relative decay rate, Min-$k$ dynamically determines truncation boundaries at each generation step. We formally prove that Min-$k$ achieves strict temperature invariance and empirically demonstrate its low sensitivity to hyperparameter choices. Experiments on multiple reasoning benchmarks, creative writing tasks, and human evaluation show that Min-$k$ consistently improves text quality, maintaining robust performance even under extreme temperature settings where probability-based methods collapse. We make our code, models, and analysis tools publicly available.
Problem

Research questions and friction points this paper is trying to address.

temperature sensitivity
sampling strategy
logit dynamics
truncation
text generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Min-$k$ Sampling
temperature invariance
logit-space truncation
semantic cliffs
dynamic decoding