Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods

📅 2025-02-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Uncertainty quantification (UQ) for large language models (LLMs) commonly relies on repeated sampling at a fixed temperature, yet temperature selection critically affects UQ quality and requires costly, model- and dataset-specific hyperparameter optimization (HPO). Method: We propose Monte Carlo Temperature (MCT), a sampling strategy that draws temperatures at random from a predefined interval when generating outputs, eliminating the need for temperature calibration. Contribution/Results: MCT is the first method to systematically characterize the fundamental impact of temperature on UQ quality, establishing a plug-and-play UQ paradigm that is temperature-robust across diverse settings and matches the statistical performance of oracle-optimal temperature selection. Evaluated across multiple LLMs and diverse NLP tasks, MCT significantly improves calibration and reliability while completely removing HPO overhead and maintaining stability over broad temperature ranges.

📝 Abstract
Uncertainty quantification (UQ) in Large Language Models (LLMs) is essential for their safe and reliable deployment, particularly in critical applications where incorrect outputs can have serious consequences. Current UQ methods typically rely on querying the model multiple times using non-zero temperature sampling to generate diverse outputs for uncertainty estimation. However, the impact of selecting a given temperature parameter is understudied, and our analysis reveals that temperature plays a fundamental role in the quality of uncertainty estimates. The conventional approach of identifying optimal temperature values requires expensive hyperparameter optimization (HPO) that must be repeated for each new model-dataset combination. We propose Monte Carlo Temperature (MCT), a robust sampling strategy that eliminates the need for temperature calibration. Our analysis reveals that: 1) MCT provides more robust uncertainty estimates across a wide range of temperatures, 2) MCT improves the performance of UQ methods by replacing fixed-temperature strategies that do not rely on HPO, and 3) MCT achieves statistical parity with oracle temperatures, which represent the ideal outcome of a well-tuned but computationally expensive HPO process. These findings demonstrate that effective UQ can be achieved without the computational burden of temperature parameter calibration.
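The core idea described in the abstract can be sketched in a few lines: instead of fixing one tuned temperature, each sample draws its own temperature from a predefined interval, and uncertainty is then estimated from the resulting outputs. The sketch below is illustrative only, using a toy categorical next-token distribution in place of a real LLM; the function names `mct_sample` and `agreement_uncertainty`, the interval bounds, and the agreement-based uncertainty proxy are assumptions, not details taken from the paper.

```python
import math
import random
from collections import Counter

def softmax(logits, temperature):
    # Temperature-scaled softmax over raw logits.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def mct_sample(logits, n_samples=20, t_min=0.1, t_max=1.5, rng=random):
    # Monte Carlo Temperature: draw a fresh temperature per sample
    # (T ~ Uniform(t_min, t_max)) instead of one HPO-tuned value.
    outputs = []
    for _ in range(n_samples):
        t = rng.uniform(t_min, t_max)
        probs = softmax(logits, t)
        outputs.append(rng.choices(range(len(logits)), weights=probs)[0])
    return outputs

def agreement_uncertainty(outputs):
    # Simple UQ proxy: 1 minus the frequency of the most common output.
    # Real UQ methods (e.g. semantic entropy) would slot in here.
    counts = Counter(outputs)
    return 1.0 - counts.most_common(1)[0][1] / len(outputs)
```

A sharply peaked toy distribution (e.g. `logits = [5.0, 1.0, 0.5]`) yields low `agreement_uncertainty`, while a flat one yields high uncertainty, across the whole temperature interval rather than at one calibrated point.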
Problem

Research questions and friction points this paper is trying to address.

Uncertainty quantification in LLMs
Impact of temperature parameter selection
Eliminating need for temperature calibration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Monte Carlo Temperature for uncertainty
Eliminates need for temperature calibration
Improves performance without hyperparameter optimization
Nicola Cecere
Amazon
Andrea Bacciu
Amazon
Ignacio Fernández Tobías
Amazon
Amin Mantrach
Amazon
Machine Learning