Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA

📅 2025-02-17
🤖 AI Summary
Standard LoRA is computationally efficient but lacks uncertainty quantification, resulting in overconfident, poorly calibrated models. Existing Bayesian LoRA variants model uncertainty, but they incur substantial parameter overhead and can be unstable to train. This paper proposes Low-Rank Bayesian LoRA (LR-BLoRA), which shows that the covariance of LoRA weight updates is inherently low-rank and, leveraging this insight, performs lightweight variational inference in a reduced-dimensional parameter space. The method introduces fewer than 5% additional parameters yet matches the calibration of full-parameter Bayesian LoRA, reducing Expected Calibration Error (ECE) by over 40%, while also improving training stability and accelerating convergence. Extensive experiments across diverse natural language understanding and generation tasks demonstrate LR-BLoRA's unified gains in efficiency, robustness, and calibration.
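The summary reports calibration gains in terms of Expected Calibration Error (ECE). As a reminder of what that metric measures, here is a minimal, generic ECE implementation (not code from the paper): predictions are binned by confidence, and the metric is the weighted average gap between each bin's mean confidence and its accuracy.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average |confidence - accuracy| over confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in the bin
    return ece

# Overconfident model: 95% confident but only 50% accurate -> ECE = 0.45
print(expected_calibration_error([0.95] * 4, [1, 1, 0, 0]))
```

A perfectly calibrated model (e.g. always correct at confidence 1.0) yields an ECE of 0; the "over 40% ECE reduction" above is relative to the baselines evaluated in the paper.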

📝 Abstract
Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of large language models by decomposing weight updates into low-rank matrices, significantly reducing storage and computational overhead. While effective, standard LoRA lacks mechanisms for uncertainty quantification, leading to overconfident and poorly calibrated models. Bayesian variants of LoRA address this limitation, but at the cost of a significantly increased number of trainable parameters, partially offsetting the original efficiency gains. Additionally, these models are harder to train and may suffer from unstable convergence. In this work, we propose a novel parameter-efficient Bayesian LoRA, demonstrating that effective uncertainty quantification can be achieved in very low-dimensional parameter spaces. The proposed method achieves strong performance with improved calibration and generalization while maintaining computational efficiency. Our empirical findings show that, with the appropriate projection of the weight space: (1) uncertainty can be effectively modeled in a low-dimensional space, and (2) weight covariances exhibit low ranks.
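To make the abstract's core idea concrete, the following is a hedged sketch (not the paper's actual method; all sizes and the fixed random projection `P` are illustrative assumptions) of modeling uncertainty over flattened LoRA parameters in a low-dimensional subspace: the variational posterior is a Gaussian whose covariance is low-rank by construction, `P diag(s^2) P^T`, so only `k` extra variance parameters are learned.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, k = 16, 8, 4, 3  # hypothetical layer sizes; k = subspace dim

# Mean LoRA factors (B initialized to zero, as in standard LoRA).
A_mean = rng.normal(scale=0.02, size=(r, d_in))
B_mean = np.zeros((d_out, r))
theta_mean = np.concatenate([A_mean.ravel(), B_mean.ravel()])
d = theta_mean.size

# Fixed projection onto a k-dimensional subspace; uncertainty lives only here,
# giving the low-rank weight covariance P diag(s^2) P^T.
P = rng.normal(size=(d, k)) / np.sqrt(d)
log_s = np.full(k, -2.0)  # learnable log std-devs in the subspace

def sample_delta_w(x):
    """One Monte Carlo sample of the LoRA update B @ A applied to input x."""
    z = rng.normal(size=k) * np.exp(log_s)   # z ~ N(0, diag(s^2))
    theta = theta_mean + P @ z               # reparameterized weight sample
    A = theta[: r * d_in].reshape(r, d_in)
    B = theta[r * d_in:].reshape(d_out, r)
    return B @ (A @ x)

x = rng.normal(size=d_in)
samples = np.stack([sample_delta_w(x) for _ in range(64)])
mean_pred, std_pred = samples.mean(0), samples.std(0)  # predictive moments
```

Averaging predictions over such samples yields calibrated predictive uncertainty while adding only `k` variance parameters on top of the usual LoRA factors, which is the kind of parameter economy the abstract describes.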
Problem

Research questions and friction points this paper is trying to address.

How to quantify uncertainty in LoRA without sacrificing its parameter efficiency
Standard LoRA produces overconfident, poorly calibrated models
Bayesian LoRA variants add substantial parameters and can destabilize training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-efficient Bayesian LoRA via variational inference in a projected, low-dimensional weight space
Empirical evidence that LoRA weight covariances are low-rank
Improved calibration and generalization at minimal additional parameter cost