🤖 AI Summary
Language models often suffer from insufficient generation diversity, and merely raising the decoding temperature fails to effectively improve recall (coverage).
Method: This paper proposes a coverage-oriented training paradigm that explicitly incorporates the precision-recall (P-R) framework into the training objective. Specifically, it augments the standard negative log-likelihood loss with an explicit term that balances coverage (recall) against precision relative to the target distribution.
Contribution/Results: The method enables models to acquire high-coverage capability during training, thereby substantially enhancing the efficacy of temperature-based sampling. Experiments across multiple benchmarks demonstrate Pareto-optimal improvements in the P-R trade-off: recall increases by up to 12.3% without sacrificing generation quality—outperforming conventional temperature scaling and state-of-the-art diversity-regularization approaches.
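The exact objective is not given in this summary, but the described idea — standard NLL plus an explicit coverage term — can be sketched as follows. This is a hypothetical illustration, not the paper's actual loss: the coverage penalty below uses a reverse-KL-style term (large when the model assigns little mass to modes the target distribution supports), and the weight `lam` is an assumed hyperparameter.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of raw logits.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def nll(model_probs, target_idx):
    # Standard negative log-likelihood of the observed target token.
    return -math.log(model_probs[target_idx])

def coverage_penalty(model_probs, target_probs, eps=1e-12):
    # Reverse-KL-style term D(p_target || p_model): it blows up when the
    # model fails to cover (assigns near-zero mass to) a supported mode.
    return sum(q * math.log((q + eps) / (p + eps))
               for q, p in zip(target_probs, model_probs) if q > 0)

def coverage_aware_loss(logits, target_probs, target_idx, lam=0.5):
    # Hypothetical combined objective: NLL for precision,
    # plus a weighted coverage term for recall.
    p = softmax(logits)
    return nll(p, target_idx) + lam * coverage_penalty(p, target_probs)
```

Under this toy formulation, a model that concentrates all mass on one mode of a multimodal target pays a large coverage penalty even while its NLL on that mode is low, which is the trade-off the P-R framing makes explicit.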
📝 Abstract
Increasing diversity in language models is a challenging yet essential objective. A common approach is to raise the decoding temperature. In this work, we investigate this approach through a simple yet common case to provide insights into why decreasing temperature can improve quality (Precision), while increasing it often fails to boost coverage (Recall). Our analysis reveals that for a model to be effectively tunable through temperature adjustments, it must be trained toward coverage. To address this, we propose rethinking loss functions in language models by leveraging the Precision-Recall framework. Our results demonstrate that this approach achieves a substantially better trade-off between Precision and Recall than merely combining negative log-likelihood training with temperature scaling. These findings offer a pathway toward more versatile and robust language modeling techniques.
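The asymmetry the abstract describes can be made concrete with a toy example (not from the paper). Temperature scaling preserves the ranking of token logits: lowering the temperature concentrates mass on the model's top token (better precision), but raising it spreads mass over all tokens in rank order, so an under-covered target mode with a low logit can never receive mass without an off-target token with a higher logit receiving even more. The vocabulary and logit values below are invented for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; subtracting the max keeps exp() stable.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# Toy 3-token vocabulary. Suppose the target supports tokens 0 and 1
# equally, but the model learned token 0 well, barely covers token 1,
# and gives an off-target token 2 a middling logit.
LOGITS = [4.0, -2.0, 1.0]

for t in (0.5, 1.0, 2.0, 5.0):
    p = softmax(LOGITS, t)
    # Ranking is preserved at every temperature: the off-target token 2
    # always outweighs the missed target mode (token 1), so raising t
    # buys "diversity" mostly in the form of precision loss.
    print(f"T={t}: p={['%.3f' % x for x in p]}")
```

Running this shows the under-covered mode's probability rising with temperature only in lockstep with (and always below) the off-target token's, which is why the paper argues the model must be trained toward coverage rather than tuned toward it at decoding time.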