🤖 AI Summary
Low-rank adaptation (LoRA) of large language models (LLMs) suffers from poor cross-task generalization, difficulty in quantifying uncertainty, and prohibitive computational overhead in existing Bayesian approaches. Method: We propose an efficient amortized Bayesian meta-learning framework tailored for LoRA, integrating variational inference with parameter reconstruction within the meta-learning pipeline. A learnable hyperparameter balances reconstruction fidelity and parameter preservation, eliminating second-order gradients and long-context prompting. Contribution/Results: Evaluated on Unified-QA and CrossFit benchmarks, our method significantly improves accuracy and reduces expected calibration error (ECE). It maintains low memory and computational costs on large models such as Llama3-8B, achieving, for the first time, scalable, well-calibrated, and highly generalizable Bayesian fine-tuning under LoRA adaptation.
📝 Abstract
Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, it is often unclear how well the fine-tuned LLM will generalize, i.e., how well it will perform on unseen datasets. Methods have been proposed to improve generalization by optimizing with in-context prompts, or by using meta-learning to fine-tune LLMs. However, these methods are expensive in memory and computation: they require either long-context prompts or saving copies of parameters and performing second-order gradient updates. To address these challenges, we propose Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting the approach to LLMs while preserving its computational efficiency. We reframe task-specific and global parameters in the context of LoRA and introduce a set of new hyperparameters that balance reconstruction accuracy against the fidelity of task-specific parameters to the global ones. ABMLL generalizes effectively and scales to large models such as Llama3-8B. Furthermore, because it uses a Bayesian framework, ABMLL provides improved uncertainty quantification. We test ABMLL on the Unified-QA and CrossFit datasets and find that it outperforms existing methods on these benchmarks in terms of both accuracy and expected calibration error.
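To make the trade-off described above concrete, the sketch below illustrates (in PyTorch) the general shape of such an objective: a task-specific Gaussian variational posterior over LoRA factors is regularized toward global (meta-learned) parameters via a KL term, weighted by a balancing hyperparameter. This is a minimal illustration under assumed simplifications, not the authors' implementation; the name `beta`, the diagonal-Gaussian posterior, and the toy regression loss are all assumptions for exposition.

```python
# Illustrative sketch of a variational LoRA objective: reconstruction loss
# plus a KL term pulling task-specific parameters toward global ones.
# All names (beta, mu_A, etc.) and the toy setup are assumptions, not the
# paper's actual code.
import torch

torch.manual_seed(0)

d, r = 8, 2  # hidden size and LoRA rank (toy values)

# Global (meta-learned) LoRA parameters: means of the A and B factors.
global_A = torch.zeros(r, d)
global_B = torch.randn(d, r)

# Task-specific variational posterior over the A factor: mean and log-variance.
mu_A = torch.zeros(r, d, requires_grad=True)
logvar_A = torch.zeros(r, d, requires_grad=True)

def kl_to_global(mu, logvar, global_mu):
    # KL( N(mu, diag(exp(logvar))) || N(global_mu, I) ), summed elementwise.
    return 0.5 * (logvar.exp() + (mu - global_mu) ** 2 - 1.0 - logvar).sum()

def elbo_loss(x, y, W0, beta=0.1):
    # Reparameterized sample of the A factor (B is kept at its global mean
    # here for brevity); the adapted weight is W0 + B @ A.
    A = mu_A + logvar_A.exp().sqrt() * torch.randn_like(mu_A)
    W = W0 + global_B @ A
    recon = torch.nn.functional.mse_loss(x @ W.T, y)
    # beta trades off reconstruction accuracy against fidelity of the
    # task-specific parameters to the global ones.
    return recon + beta * kl_to_global(mu_A, logvar_A, global_A)

W0 = torch.randn(d, d)           # frozen pretrained weight (toy stand-in)
x, y = torch.randn(16, d), torch.randn(16, d)
loss = elbo_loss(x, y, W0)
loss.backward()  # first-order gradients only; no second-order updates needed
```

Note that only first-order gradients of the variational parameters are required, which is the source of the memory and compute savings relative to second-order meta-learning.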