🤖 AI Summary
To address the prohibitive computational and memory overhead of full-parameter fine-tuning for large language models (LLMs), this paper proposes Sparse Random Adapter (SRA): a parameter-efficient fine-tuning (PEFT) method that randomly freezes the vast majority of model parameters and applies standard backpropagation updates to only a tiny fraction (e.g., 0.1%–0.5%) of weights, without low-rank decomposition, adapter modules, or auxiliary parameters. Crucially, we identify sparsity itself as the core mechanism enabling efficient PEFT. Extensive experiments demonstrate that SRA matches or surpasses LoRA in performance across diverse alignment and downstream tasks, while reducing GPU memory consumption by 40% and FLOPs by 35%. To our knowledge, this is the first work to rigorously validate purely random sparse weight updates as a lightweight, general-purpose, and highly effective PEFT paradigm.
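The core idea can be sketched in a few lines: draw a fixed random boolean mask selecting a tiny fraction of weights, then apply an ordinary gradient step only where the mask is set. This is a minimal NumPy illustration of the mechanism described above, not the paper's implementation; the function names, the 0.5% density, and the plain SGD step are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sparse_mask(shape, density, rng):
    """Boolean mask marking a random `density` fraction of entries as trainable."""
    n = int(np.prod(shape))
    k = max(1, int(round(density * n)))
    idx = rng.choice(n, size=k, replace=False)  # sample trainable positions
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask.reshape(shape)

def sparse_sgd_step(W, grad, mask, lr=0.1):
    """Standard SGD update applied only where mask is True; frozen elsewhere."""
    return W - lr * grad * mask

W = rng.standard_normal((100, 100))
mask = make_sparse_mask(W.shape, density=0.005, rng=rng)  # 0.5% trainable
grad = rng.standard_normal(W.shape)
W_new = sparse_sgd_step(W, grad, mask)
```

In a deep-learning framework the same effect is usually achieved by multiplying each parameter's gradient by its mask (or zeroing `requires_grad` structure) before the optimizer step, so the frozen 99.5% of weights never change.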
📝 Abstract
Full fine-tuning of large language models for alignment and task adaptation has become prohibitively expensive as models have grown in size. Parameter-Efficient Fine-Tuning (PEFT) methods aim to significantly reduce the computational and memory resources needed for fine-tuning by training only a small subset of parameters rather than all model parameters. Currently, the most popular PEFT method is Low-Rank Adaptation (LoRA), which freezes the pretrained model's parameters and introduces a small set of trainable parameters in the form of low-rank matrices. We instead propose simply reducing the number of trainable parameters by randomly selecting a small proportion of the existing model parameters to train. In this paper, we compare the efficiency and performance of our proposed approach against PEFT methods, including LoRA, as well as full-parameter fine-tuning.
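To make the comparison concrete: LoRA parameterizes the weight update of a d×k matrix as a low-rank product ΔW = BA with B of shape d×r and A of shape r×k, so it trains r(d+k) parameters, while random sparse selection trains a fixed fraction of the d·k entries directly. The dimensions, rank, and 0.5% density below are illustrative assumptions, not values from the paper.

```python
# Trainable-parameter counts for one d x k weight matrix under three schemes.
d, k, r = 4096, 4096, 8   # hidden sizes and a small LoRA rank (illustrative)

full_params = d * k        # full fine-tuning: every entry is trainable
lora_params = r * (d + k)  # LoRA adapter: B (d x r) plus A (r x k)
sra_params = int(0.005 * d * k)  # random sparse selection at 0.5% density

print(full_params, lora_params, sra_params)
```

Both reduced schemes train well under 1% of the full matrix; the difference is that LoRA adds new matrices alongside the frozen weights, whereas random sparse selection updates a subset of the original weights in place.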