🤖 AI Summary
Uncertainty quantification in large language models (LLMs) typically relies on computationally expensive Bayesian or ensemble methods that require multiple forward passes. This work proposes an efficient distillation framework whose novelty lies in integrating evidential learning with knowledge distillation, enabling lightweight student models to produce robust uncertainty estimates in a single forward pass. The student is fine-tuned with LoRA, and two distillation strategies are compared on classification tasks: soft-label distillation and Dirichlet-distribution-based evidential distillation. Empirical results show that the distilled student matches or even surpasses the multi-sample teacher in predictive accuracy and uncertainty calibration while drastically reducing inference latency. The approach thus offers a favorable trade-off among accuracy, calibration fidelity, and computational efficiency, providing a scalable path toward uncertainty-aware deployment of LLMs.
📝 Abstract
Accurate uncertainty quantification remains a key challenge for standard LLMs, prompting the adoption of Bayesian and ensemble-based methods. However, such methods typically require computationally expensive sampling, i.e., multiple forward passes, to estimate predictive uncertainty effectively.
In this paper, we introduce a novel approach that enables efficient and effective uncertainty estimation in LLMs without sacrificing performance. Specifically, we distill uncertainty-aware teacher models, which originally require multiple forward passes, into compact student models that share the same architecture but are fine-tuned using Low-Rank Adaptation (LoRA). We compare two distinct distillation strategies: one in which the student employs traditional softmax-based outputs, and another in which the student leverages Dirichlet-distributed outputs to explicitly model epistemic uncertainty via evidential learning.
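The two student output heads described above can be sketched in miniature. The snippet below is an illustrative assumption, not the paper's exact formulation: the soft-label student is trained toward the teacher's averaged probabilities with a KL objective, while the evidential student maps logits to Dirichlet parameters (here via a softplus evidence function, a common but assumed choice) from which both class probabilities and an epistemic-uncertainty score fall out of one forward pass.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def soft_label_loss(student_logits, teacher_probs):
    """Soft-label distillation: KL(teacher || student) on class probabilities."""
    q = softmax(student_logits)
    return sum(t * math.log(t / qi) for t, qi in zip(teacher_probs, q) if t > 0)

def evidential_outputs(student_logits):
    """Evidential head: Dirichlet parameters alpha_k = softplus(logit_k) + 1.
    Returns expected class probabilities and a vacuity-style epistemic
    uncertainty K / sum(alpha), which is high when total evidence is low."""
    alpha = [math.log1p(math.exp(x)) + 1.0 for x in student_logits]
    strength = sum(alpha)
    probs = [a / strength for a in alpha]
    epistemic = len(alpha) / strength
    return probs, epistemic
```

For example, confident logits such as `[10, 0, 0]` yield a lower epistemic score than uniform logits `[0, 0, 0]`, since they contribute more total Dirichlet evidence; both the prediction and the uncertainty come from the same single pass.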
Empirical evaluations on classification datasets demonstrate that such students can achieve predictive and uncertainty-quantification performance comparable or superior to their teacher models while, critically, requiring only a single forward pass. To our knowledge, this is the first demonstration that immediate and robust uncertainty quantification can be achieved in LLMs through evidential distillation.