Improving Diversity in Language Models: When Temperature Fails, Change the Loss

📅 2025-08-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Language models often suffer from insufficient generation diversity, and merely increasing the decoding temperature fails to improve recall (coverage) effectively. Method: This paper proposes a coverage-oriented training paradigm that explicitly incorporates the precision-recall (P-R) framework into the training objective. Specifically, it reformulates the loss function by augmenting standard negative log-likelihood with an explicit term that balances coverage (recall) against precision relative to the target distribution. Contribution/Results: The method equips models with high-coverage capability during training, thereby substantially enhancing the efficacy of temperature-based sampling. Experiments across multiple benchmarks demonstrate Pareto-optimal improvements in the P-R trade-off: recall increases by up to 12.3% without sacrificing generation quality, outperforming conventional temperature scaling and state-of-the-art diversity-regularization approaches.
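The summary does not spell out the exact objective, so the following is only a plausible minimal sketch of the idea it describes: interpolating a coverage (recall) pressure with a precision pressure around the target distribution. Here the forward KL term plays the recall role and the reverse KL term the precision role; the names `pr_loss` and the weight `lam` are illustrative, not from the paper.

```python
import math

def kl(p, q, eps=1e-9):
    """KL(p || q) over a finite vocabulary, with smoothing to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def pr_loss(p_data, q_model, lam=0.5):
    """Hypothetical interpolation of a recall term with a precision term.

    Forward KL(p || q) punishes the model for assigning low probability to
    data modes (coverage / recall); reverse KL(q || p) punishes probability
    mass placed outside the data distribution (precision). lam trades them off.
    """
    recall_term = kl(p_data, q_model)     # mass-covering pressure
    precision_term = kl(q_model, p_data)  # mode-seeking pressure
    return lam * recall_term + (1.0 - lam) * precision_term

# Toy target over 4 outcomes, and a model that drops the last mode entirely.
p = [0.4, 0.3, 0.2, 0.1]
q_mode_dropping = [0.5, 0.35, 0.15, 0.0]
q_covering = [0.35, 0.3, 0.25, 0.1]

# The recall term blows up for the mode-dropping model but stays small for
# the covering one -- training pressure that temperature scaling cannot add.
print(kl(p, q_mode_dropping) > kl(p, q_covering))  # True
```

Pure NLL training corresponds to optimizing only the forward-KL direction on samples; the point of the sketch is that an explicit precision/recall decomposition makes the trade-off a tunable training-time quantity rather than a decoding-time afterthought.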


📝 Abstract
Increasing diversity in language models is a challenging yet essential objective. A common approach is to raise the decoding temperature. In this work, we investigate this approach through a simplistic yet common case to provide insights into why decreasing temperature can improve quality (Precision), while increasing it often fails to boost coverage (Recall). Our analysis reveals that for a model to be effectively tunable through temperature adjustments, it must be trained toward coverage. To address this, we propose rethinking loss functions in language models by leveraging the Precision-Recall framework. Our results demonstrate that this approach achieves a substantially better trade-off between Precision and Recall than merely combining negative log-likelihood training with temperature scaling. These findings offer a pathway toward more versatile and robust language modeling techniques.
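The abstract's asymmetry between lowering and raising temperature can be seen in a few lines of standard temperature-scaled softmax (this is textbook decoding, not code from the paper): lowering T concentrates mass on the top token, but raising T can only redistribute mass among tokens the trained model already ranks, so it cannot recover coverage the model never learned.

```python
import math

def softmax_with_temperature(logits, T):
    """Scale logits by 1/T before the softmax; T < 1 sharpens, T > 1 flattens."""
    scaled = [z / T for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, -1.0]

cold = softmax_with_temperature(logits, 0.5)   # sharper: mass concentrates on the top token
base = softmax_with_temperature(logits, 1.0)
hot  = softmax_with_temperature(logits, 2.0)   # flatter: mass spreads across the tail

# Lowering T raises the top token's probability (Precision-like behaviour);
# raising T flattens the distribution, but only over tokens the model already
# scores -- it cannot restore modes the trained model assigns ~0 probability.
print(max(cold) > max(base) > max(hot))  # True
```

This is why the abstract argues the model must be trained toward coverage in the first place for temperature to become an effective diversity knob.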
Problem

Research questions and friction points this paper is trying to address.

Investigates why raising the decoding temperature fails to improve coverage (Recall), even though lowering it reliably improves quality (Precision)
Asks what training objective a model needs for temperature adjustments to become an effective diversity control
Aims for a better Precision-Recall trade-off than negative log-likelihood training combined with temperature scaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rethinks language-model loss functions through the Precision-Recall framework, augmenting negative log-likelihood with an explicit coverage term
Trains models toward coverage so that temperature becomes a usable tuning knob at decoding time
Achieves a substantially better Precision-Recall trade-off than NLL training plus temperature scaling