🤖 AI Summary
This work addresses the poor calibration of large language models under small-data fine-tuning, which often stems from overconfidence. Inspired by sparse Gaussian processes (SGPs), the authors propose a Bayesian low-rank adaptation method that integrates Bayesian inference into the LoRA framework by revealing an isomorphism between LoRA's factorization and the Kronecker-structured SGP posterior. This enables principled uncertainty quantification with minimal overhead—only approximately 0.42 million extra parameters and 1.2× training cost. Evaluated on models up to 30B parameters, the approach substantially improves calibration, reducing expected calibration error (ECE) by up to 84% and negative log-likelihood (NLL) by up to 76%, while maintaining competitive accuracy.
📝 Abstract
Large Language Models are typically optimized for accuracy and therefore guess even when uncertain about a prediction; this problem is especially severe when fine-tuning on small datasets, which inherently tends to produce miscalibration. In this work, we introduce Bayesian-LoRA, which reformulates the deterministic LoRA update as a probabilistic low-rank representation inspired by Sparse Gaussian Processes. We identify a structural isomorphism between LoRA's factorization and Kronecker-factored SGP posteriors, and show that LoRA emerges as a limiting case when posterior uncertainty collapses. We conduct extensive experiments on various LLM architectures across commonsense reasoning benchmarks. With only approximately 0.42M additional parameters and ${\approx}1.2{\times}$ training cost relative to standard LoRA, Bayesian-LoRA significantly improves calibration across models up to 30B parameters, achieving up to 84% ECE reduction and 76% NLL reduction while maintaining competitive accuracy in both in-distribution and out-of-distribution (OoD) evaluations.
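The reformulation described in the abstract can be illustrated with a minimal sketch. This is a generic mean-field Gaussian toy, not the paper's exact Kronecker-structured SGP posterior; all names, shapes, and the log-std parameterization are assumptions made for illustration. The key ideas it shows are (1) the LoRA update $\Delta W = BA$ with the factors treated probabilistically, and (2) standard deterministic LoRA recovered as the limit where posterior uncertainty collapses:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2  # toy dimensions; real LoRA ranks attach to LLM weight matrices

# LoRA writes the weight update as Delta W = B @ A.
# A Bayesian variant treats the low-rank factors probabilistically; here we
# place independent Gaussians on each entry, with means playing the role of
# the usual LoRA parameters (an assumed mean-field parameterization).
B_mean = rng.normal(size=(d_out, r))
A_mean = rng.normal(size=(r, d_in))
B_logstd = np.full((d_out, r), -3.0)  # learnable log standard deviations
A_logstd = np.full((r, d_in), -3.0)

def sample_delta_w(rng):
    """Draw one posterior sample of the low-rank update Delta W = B @ A."""
    B = B_mean + np.exp(B_logstd) * rng.normal(size=B_mean.shape)
    A = A_mean + np.exp(A_logstd) * rng.normal(size=A_mean.shape)
    return B @ A

# Predictive uncertainty comes from Monte Carlo averaging over weight samples.
samples = np.stack([sample_delta_w(rng) for _ in range(64)])
mc_mean = samples.mean(axis=0)  # shape (d_out, d_in)

# Limiting case: as the posterior std-devs shrink toward zero, every sample
# equals the mean update, recovering deterministic LoRA.
B_logstd[:] = -50.0
A_logstd[:] = -50.0
collapsed = sample_delta_w(rng)
assert np.allclose(collapsed, B_mean @ A_mean)
```

In this sketch the extra parameter cost is just the log-std arrays, mirroring the abstract's point that the Bayesian treatment adds only a small overhead on top of the LoRA factors themselves.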