🤖 AI Summary
This work addresses the poor calibration of large language models under small-data fine-tuning, which often stems from overconfidence. Inspired by sparse Gaussian processes (SGPs), the authors propose a Bayesian low-rank adaptation method that integrates Bayesian inference into the LoRA framework by revealing an isomorphism between LoRA's factorization and the Kronecker-structured SGP posterior. This enables principled uncertainty quantification with minimal overhead—only approximately 0.42 million extra parameters and 1.2× training cost. Evaluated on models up to 30B parameters, the approach substantially improves calibration, reducing expected calibration error (ECE) by up to 84% and negative log-likelihood (NLL) by up to 76%, while maintaining competitive accuracy.
📝 Abstract
Large Language Models are typically optimized for accuracy and therefore guess even when uncertain about a prediction; this problem is especially severe when fine-tuning on small datasets, which inherently tends to produce miscalibration. In this work, we introduce Bayesian-LoRA, which reformulates the deterministic LoRA update as a probabilistic low-rank representation inspired by Sparse Gaussian Processes. We identify a structural isomorphism between LoRA's factorization and Kronecker-factored SGP posteriors, and show that LoRA emerges as a limiting case when posterior uncertainty collapses. We conduct extensive experiments on various LLM architectures across commonsense reasoning benchmarks. With only approximately 0.42M additional parameters and ${\approx}1.2{\times}$ training cost relative to standard LoRA, Bayesian-LoRA significantly improves calibration across models up to 30B parameters, achieving up to 84% ECE reduction and 76% NLL reduction while maintaining competitive accuracy in both in-distribution and out-of-distribution (OoD) evaluations.
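The reformulation described in the abstract can be illustrated with a minimal sketch. This is a generic mean-field Gaussian toy, not the paper's exact Kronecker-structured SGP posterior; all names, shapes, and the log-std parameterization are assumptions made for illustration. The key ideas it shows are (1) the LoRA update $\Delta W = BA$ with the factors treated probabilistically, and (2) standard deterministic LoRA recovered as the limit where posterior uncertainty collapses:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2  # toy dimensions; real LoRA ranks attach to LLM weight matrices

# LoRA writes the weight update as Delta W = B @ A.
# A Bayesian variant treats the low-rank factors probabilistically; here we
# place independent Gaussians on each entry, with means playing the role of
# the usual LoRA parameters (an assumed mean-field parameterization).
B_mean = rng.normal(size=(d_out, r))
A_mean = rng.normal(size=(r, d_in))
B_logstd = np.full((d_out, r), -3.0)  # learnable log standard deviations
A_logstd = np.full((r, d_in), -3.0)

def sample_delta_w(rng):
    """Draw one posterior sample of the low-rank update Delta W = B @ A."""
    B = B_mean + np.exp(B_logstd) * rng.normal(size=B_mean.shape)
    A = A_mean + np.exp(A_logstd) * rng.normal(size=A_mean.shape)
    return B @ A

# Predictive uncertainty comes from Monte Carlo averaging over weight samples.
samples = np.stack([sample_delta_w(rng) for _ in range(64)])
mc_mean = samples.mean(axis=0)  # shape (d_out, d_in)

# Limiting case: as the posterior std-devs shrink toward zero, every sample
# equals the mean update, recovering deterministic LoRA.
B_logstd[:] = -50.0
A_logstd[:] = -50.0
collapsed = sample_delta_w(rng)
assert np.allclose(collapsed, B_mean @ A_mean)
```

In this sketch the extra parameter cost is just the log-std arrays, mirroring the abstract's point that the Bayesian treatment adds only a small overhead on top of the LoRA factors themselves.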