Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

📅 2024-12-07
🏛️ arXiv.org
📈 Citations: 1
Influential: 1
🤖 AI Summary
Low-rank adapters (LoRA) for large language models (LLMs) lack inherent uncertainty quantification capabilities. Method: We propose Training-Free Bayesianization (TFB), a framework that converts trained LoRA adapters into Bayesian ones without any additional training. TFB constrains the weight posterior to a family of low-rank isotropic Gaussian distributions and searches for the maximally acceptable posterior variance; a variational-inference analysis shows that, under mild conditions, this search is equivalent to KL-regularized variational optimization. Contribution/Results: TFB removes the conventional reliance on fine-tuning or post-hoc calibration when Bayesianizing parameter-efficient fine-tuning (PEFT) adapters. Empirically, it achieves state-of-the-art uncertainty calibration, out-of-distribution detection, and robust generalization across diverse LLMs and tasks. Code will be publicly available.

📝 Abstract
Estimating the uncertainty of responses from Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free Bayesianization (TFB), a simple yet theoretically grounded framework that efficiently transforms trained low-rank adapters into Bayesian ones without additional training. TFB systematically searches for the maximally acceptable level of variance in the weight posterior, constrained within a family of low-rank isotropic Gaussian distributions. Our theoretical analysis shows that under mild conditions, this search process is equivalent to KL-regularized variational optimization, a generalized form of variational inference. Through comprehensive experiments, we show that TFB achieves superior uncertainty estimation and generalization compared to existing methods while eliminating the need for complex Bayesianization training procedures. Code will be available at https://github.com/Wang-ML-Lab/bayesian-peft.
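The search the abstract describes can be illustrated with a toy sketch: given trained LoRA factors, perturb the up-projection with isotropic Gaussian noise and binary-search the largest noise scale whose average output deviation stays within a tolerance. All names, shapes, and the tolerance criterion below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical trained LoRA factors (illustrative shapes, not the paper's code).
d, k, r = 16, 16, 4
A = rng.normal(size=(r, k)) / np.sqrt(k)   # down-projection
B = rng.normal(size=(d, r)) / np.sqrt(r)   # up-projection
x = rng.normal(size=(k,))                  # a probe input

def perturbed_output(sigma, n_samples=64):
    """Mean |output deviation| when B is perturbed by isotropic Gaussian noise of scale sigma."""
    base = B @ (A @ x)
    devs = []
    for _ in range(n_samples):
        B_s = B + sigma * rng.normal(size=B.shape)  # low-rank isotropic Gaussian sample
        devs.append(np.abs(B_s @ (A @ x) - base).mean())
    return float(np.mean(devs))

def search_max_variance(tol=0.05, lo=0.0, hi=1.0, iters=30):
    """Binary-search the largest sigma whose average output deviation stays under tol."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if perturbed_output(mid) <= tol:
            lo = mid   # acceptable: try a larger variance
        else:
            hi = mid
    return lo

sigma_star = search_max_variance()
```

The sketch uses a fixed output-deviation tolerance as the acceptance criterion; the paper's actual notion of "acceptable" variance is defined through its variational analysis.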
Problem

Research questions and friction points this paper is trying to address.

Estimating uncertainty in Large Language Model (LLM) responses
Avoiding complex fine-tuning or post-training procedures for Bayesian low-rank adapters
Transforming trained adapters into Bayesian ones without additional training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-Free Bayesianization (TFB) for low-rank adapters
Searches for the maximal acceptable variance within a family of low-rank isotropic Gaussian posteriors
Search shown equivalent to KL-regularized variational optimization
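Once a variance has been found, a Bayesianized adapter yields uncertainty estimates by averaging predictions over posterior samples. The following toy sketch (assumed setup: a tiny linear classifier with a LoRA update; all names and shapes are hypothetical) draws several noisy adapters and computes the predictive entropy of the averaged softmax.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical Bayesianized setup (illustrative, not the paper's code).
k, r, n_classes = 8, 2, 3
W0 = rng.normal(size=(n_classes, k)) * 0.1   # frozen base weight
A = rng.normal(size=(r, k)) * 0.1            # LoRA down-projection
B = rng.normal(size=(n_classes, r)) * 0.1    # LoRA up-projection
sigma = 0.05                                 # variance assumed found by the TFB-style search

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predictive_distribution(x, n_samples=32):
    """Average softmax over posterior samples of the Bayesianized adapter."""
    probs = np.zeros(n_classes)
    for _ in range(n_samples):
        B_s = B + sigma * rng.normal(size=B.shape)  # isotropic Gaussian posterior sample
        probs += softmax(W0 @ x + B_s @ (A @ x))
    return probs / n_samples

def predictive_entropy(p):
    """Entropy of the averaged predictive distribution (higher = more uncertain)."""
    return float(-(p * np.log(p + 1e-12)).sum())

x = rng.normal(size=(k,))
p = predictive_distribution(x)
H = predictive_entropy(p)
```

High predictive entropy flags inputs the model is uncertain about, which is the basis for the calibration and out-of-distribution detection experiments the paper reports.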