🤖 AI Summary
LoRA’s random initialization restricts gradient updates to a tangent subspace that is misaligned with the pretrained model’s activation distribution, causing early information loss and slow convergence. To address this, we propose Activation Boundary Matching (ABM), a novel initialization strategy that aligns the activation boundaries of LoRA adapters with those of the backbone model before downstream fine-tuning. ABM maximizes the projection of full-parameter gradients onto the low-rank subspace, substantially reducing optimization bias at initialization. To our knowledge, this is the first work to incorporate activation boundary alignment into LoRA initialization. The method is architecture-agnostic and is validated on T5, LLaMA2, and ViT. Empirical results show accelerated convergence across the GLUE, WizardLM, and VTAB-1K benchmarks; on VTAB-1K, ABM yields a +2.1% average accuracy gain, with particularly pronounced improvements on geometric reasoning tasks, confirming its parameter efficiency and strong generalization.
📝 Abstract
We propose Activation Boundary Matching for Low-Rank Adaptation (ABM-LoRA), a principled initialization strategy that substantially accelerates the convergence of low-rank adapters. While LoRA offers high parameter efficiency, its random initialization restricts gradient updates to a mismatched tangent space, causing significant information loss and hindering early convergence. ABM-LoRA addresses this by aligning the adapter's activation boundaries with those of the pretrained model before downstream training, thereby maximizing the projection of full-parameter gradients into the adapter subspace. This alignment sharply reduces information loss at initialization, yields a lower starting loss, and accelerates convergence. We demonstrate ABM-LoRA's effectiveness across diverse architectures and tasks: language understanding (T5-Base on GLUE), dialogue generation (LLaMA2-7B on WizardLM), and vision recognition (ViT-B/16 on VTAB-1K). On VTAB-1K, it achieves the highest accuracy among all compared methods, with strong gains on structured reasoning tasks requiring geometric understanding.
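The abstract's central claim is that a randomly initialized adapter only captures the component of the full-parameter gradient that falls inside its low-rank subspace. The sketch below illustrates this with standard LoRA initialization (Gaussian `A`, zero `B`); it does not implement ABM itself, whose construction is not specified here, and the dimensions and the toy gradient `g` are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4  # toy hidden size and adapter rank (illustrative, not from the paper)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # standard LoRA: A is random Gaussian
B = np.zeros((d, r))                 # standard LoRA: B starts at zero

x = rng.normal(size=d)               # an input activation

# At initialization the adapter contributes nothing (B @ A = 0),
# so the adapted forward pass matches the pretrained model exactly.
y = W @ x + B @ (A @ x)
assert np.allclose(y, W @ x)

# Let g stand for a full-parameter gradient dL/dW (random here, for illustration).
g = rng.normal(size=(d, d))

# The gradient reaching B is dL/dB = g @ A.T, so the effective weight update
# (dL/dB) @ A lies in the row space of the random A. The orthogonal projection
# of g onto that subspace measures how much of the full gradient the adapter
# can express; the rest is the "information loss" the abstract refers to.
P = A.T @ np.linalg.pinv(A @ A.T) @ A   # projector onto the row space of A
captured = np.linalg.norm(g @ P) / np.linalg.norm(g)
print(f"fraction of gradient norm captured by a random rank-{r} subspace: {captured:.2f}")
```

For a random rank-r subspace the captured fraction concentrates around sqrt(r/d), which is well below 1; an initialization that chooses `A` to align with the model's activation structure, as ABM-LoRA proposes, aims to raise this projection and hence lower the starting loss.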