🤖 AI Summary
To address the limited computational and data resources available for fine-tuning large models, this paper proposes PVeRA, a probabilistic version of the VeRA adapter. Building on a frozen backbone, PVeRA treats the frozen random low-rank matrices of VeRA probabilistically, which lets it handle inherent ambiguities in the input and use different sampling configurations during training and testing. By optimizing only a small set of vector parameters, PVeRA adapts to new tasks efficiently. In an evaluation on the VTAB-1k benchmark with seven adapters, PVeRA outperforms VeRA and the other adapters.
📝 Abstract
Large foundation models have emerged in recent years and are pushing performance boundaries on a variety of tasks. Training or even finetuning such models demands vast datasets and computational resources, which are often scarce and costly. Adaptation methods offer a computationally efficient way around these limitations by allowing such models to be finetuned with small amounts of data and computing power. This is achieved by appending new trainable modules, which hold only a fraction of the trainable parameters, to a frozen backbone and fitting only these modules on novel tasks. Recently, the VeRA adapter was shown to excel in parameter-efficient adaptation by using a pair of frozen random low-rank matrices shared across all layers. In this paper, we propose PVeRA, a probabilistic version of the VeRA adapter, which modifies the low-rank matrices of VeRA in a probabilistic manner. This modification naturally handles inherent ambiguities in the input and allows different sampling configurations during training and testing. A comprehensive evaluation on the VTAB-1k benchmark with seven adapters shows PVeRA outperforming VeRA and the other adapters. Our code for training models with PVeRA and benchmarking all adapters is available at https://github.com/leofillioux/pvera.
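To make the adapter structure described above concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function `vera_forward` illustrates a VeRA-style update in which the frozen random matrices `A` and `B` are rescaled by small trainable vectors `d_vec` and `b_vec`. The function `probabilistic_forward` is purely an assumption about what a probabilistic variant could look like: the abstract does not specify how randomness is introduced, so the second frozen matrix `A_logvar`, the Gaussian reparameterized sampling, and the `sample` flag are hypothetical names chosen for illustration. All shapes and initial values are likewise illustrative.

```python
import numpy as np


def vera_forward(x, W0, A, B, d_vec, b_vec):
    """Frozen layer plus a VeRA-style update: W0 x + b * (B (d * (A x)))."""
    return W0 @ x + b_vec * (B @ (d_vec * (A @ x)))


def probabilistic_forward(x, W0, A_mu, A_logvar, B, d_vec, b_vec, rng, sample=True):
    """Hypothetical probabilistic variant (an assumption, not the paper's method).

    The low-rank code is treated as a Gaussian whose mean comes from A_mu and
    whose log-variance comes from A_logvar. With sample=False only the mean is used.
    """
    mu = d_vec * (A_mu @ x)
    if sample:
        std = np.exp(0.5 * (A_logvar @ x))
        z = mu + std * rng.standard_normal(mu.shape)  # reparameterized draw
    else:
        z = mu
    return W0 @ x + b_vec * (B @ z)


rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 6, 4

W0 = rng.normal(size=(d_out, d_in))             # frozen backbone weight (stand-in)
A = rng.normal(size=(rank, d_in))               # frozen random matrix, shared across layers
A_logvar = 0.1 * rng.normal(size=(rank, d_in))  # hypothetical frozen log-variance matrix
B = rng.normal(size=(d_out, rank))              # frozen random matrix, shared across layers

d_vec = np.full(rank, 0.1)  # trainable scaling vector (illustrative initial value)
b_vec = np.zeros(d_out)     # trainable scaling vector; zeros leave the frozen layer unchanged

x = rng.normal(size=d_in)

y_det = vera_forward(x, W0, A, B, d_vec, b_vec)
y_mean = probabilistic_forward(x, W0, A, A_logvar, B, d_vec, b_vec, rng, sample=False)

# Only the two scaling vectors would be trained in this sketch.
num_trainable = d_vec.size + b_vec.size
```

In this sketch the trainable parameter count per layer is `rank + d_out`, independent of `d_in`, because the matrices themselves stay frozen. Switching `sample` between training and testing is one simple way to realize different sampling configurations, again under the assumptions stated above.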