BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity

📅 2025-08-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Low-rank adaptation (LoRA) improves the efficiency of fine-tuning large language models (LLMs), yet increasing the rank $r$ to boost performance incurs substantial parameter overhead. To address this, we propose Block Diversified Low-Rank Adaptation (BoRA), which introduces learnable block-wise diagonal scaling matrices on top of the standard low-rank decomposition. These matrices model inter-block diversity, effectively raising the representational rank while adding only a negligible number of parameters. BoRA combines block-wise matrix multiplication with structured low-rank updates, preserving computational efficiency and inference compatibility. Extensive experiments across multiple benchmark datasets and mainstream LLMs, including LLaMA and Qwen, demonstrate that BoRA achieves average accuracy gains of 2.3–4.1 percentage points over standard LoRA with less than 0.1% additional trainable parameters. Moreover, BoRA exhibits strong scalability and training stability, making it a practical and efficient enhancement to existing LoRA-based fine-tuning pipelines.

📝 Abstract
Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models (LLMs). It approximates the update of a pretrained weight matrix $W \in \mathbb{R}^{m \times n}$ by the product of two low-rank matrices, $BA$, where $A \in \mathbb{R}^{r \times n}$ and $B \in \mathbb{R}^{m \times r}$ ($r \ll \min\{m,n\}$). Increasing the dimension $r$ can raise the rank of the LoRA weights (i.e., $BA$), which typically improves fine-tuning performance but also significantly increases the number of trainable parameters. In this paper, we propose Block Diversified Low-Rank Adaptation (BoRA), which raises the rank of the LoRA weights with a small number of additional parameters. Specifically, BoRA treats the product $BA$ as a block matrix multiplication, where $A$ and $B$ are partitioned into $b$ blocks along the columns and rows, respectively (i.e., $A = [A_1, \dots, A_b]$ and $B = [B_1, \dots, B_b]^\top$). Consequently, the product $BA$ becomes the concatenation of the block products $B_i A_j$ for $i, j \in [b]$. To enhance the diversity of the block products, BoRA introduces a unique diagonal matrix $\Sigma_{i,j} \in \mathbb{R}^{r \times r}$ for each block multiplication, yielding $B_i \Sigma_{i,j} A_j$. By leveraging these block-wise diagonal matrices, BoRA increases the rank of the LoRA weights by a factor of $b$ while requiring only $b^2 r$ additional parameters. Extensive experiments across multiple datasets and models demonstrate the superiority of BoRA, and ablation studies further validate its scalability.
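The block construction in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the shapes, block count, random initialization, and variable names are assumptions chosen for clarity, and the $\Sigma_{i,j}$ are stored as a $b \times b \times r$ array of diagonal entries.

```python
import numpy as np

# Illustrative sketch of the BoRA update Delta W with blocks B_i Sigma_{i,j} A_j.
# All sizes below are arbitrary choices, not values from the paper.
rng = np.random.default_rng(0)
m, n, r, b = 8, 8, 2, 2  # weight is m x n, LoRA rank r, b blocks per side

B = rng.standard_normal((m, r))  # B in R^{m x r}
A = rng.standard_normal((r, n))  # A in R^{r x n}

# Partition B into b row blocks and A into b column blocks.
B_blocks = np.split(B, b, axis=0)  # each B_i is (m/b) x r
A_blocks = np.split(A, b, axis=1)  # each A_j is r x (n/b)

# One learnable diagonal Sigma_{i,j} per block product: b^2 * r extra parameters.
Sigma = rng.standard_normal((b, b, r))

# Assemble the update: the (i, j) block of Delta W is B_i Sigma_{i,j} A_j.
delta_W = np.block([
    [B_blocks[i] @ np.diag(Sigma[i, j]) @ A_blocks[j] for j in range(b)]
    for i in range(b)
])

# Plain LoRA (all Sigma_{i,j} equal to the identity) gives rank(BA) <= r;
# with generic block-wise diagonals, the rank of Delta W can reach b * r.
lora_rank = np.linalg.matrix_rank(B @ A)
bora_rank = np.linalg.matrix_rank(delta_W)
print(lora_rank, bora_rank)
```

Note that when every $\Sigma_{i,j}$ is the identity, `delta_W` collapses back to the plain product $BA$, so BoRA strictly generalizes LoRA at a cost of only $b^2 r$ scalars.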
Problem

Research questions and friction points this paper is trying to address.

Enhancing LoRA rank with minimal extra parameters
Improving fine-tuning performance via block diversity
Scaling low-rank adaptation efficiently in LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Block Diversified Low-Rank Adaptation (BoRA)
Partitioned block matrix multiplication
Diagonal matrices enhance block diversity