Rethinking Adapter Placement: A Dominant Adaptation Module Perspective

📅 2026-05-07

🤖 AI Summary

This work addresses a central question in parameter-efficient fine-tuning: where to place a limited number of LoRA adapters to maximize performance. The authors propose PAGE (Projected Adapter Gradient Energy), a gradient-based sensitivity probe that estimates the initial trainable gradient energy available to each candidate adapter, and find that this energy is highly concentrated on a single shallow FFN down-projection. They term this the dominant adaptation module and show, across two model families and four downstream tasks, that its layer index is architecture-dependent but task-stable. Building on this finding, they introduce DomLoRA, which places a single LoRA adapter at the dominant adaptation module. With only ~0.7% of vanilla LoRA's trainable parameters, DomLoRA outperforms it on average across instruction following, mathematical reasoning, code generation, and multi-turn conversation, and it also improves other LoRA variants.
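To give a feel for where a ratio of this size can come from, the sketch below counts LoRA parameters for a hypothetical decoder. The hidden size, FFN size, layer count, rank, and the set of seven adapted projections per layer are illustrative assumptions, not the paper's configuration, so the resulting ratio is not expected to reproduce the reported ~0.7%.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters of one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical model shape, chosen only for illustration (not taken from the paper).
HIDDEN, FFN, LAYERS, RANK = 4096, 11008, 32, 8

# (d_in, d_out) of each projection that a broad LoRA setup might adapt in one layer.
module_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}

# Adapters on every listed projection in every layer.
per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in module_shapes.values())
full_placement = LAYERS * per_layer

# A single adapter on one FFN down-projection.
single_adapter = lora_params(*module_shapes["down_proj"], RANK)

ratio = single_adapter / full_placement
print(f"single adapter uses {100 * ratio:.2f}% of the full placement's parameters")
```

Under these assumed shapes, one down-projection adapter holds roughly 0.6% of the parameters of the full placement. When both placements use the same rank, the rank cancels and the ratio depends only on the module shapes and the layer count.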

📝 Abstract

Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning method that places trainable low-rank adapters into frozen pre-trained models. Recent studies show that using fewer LoRA adapters may still maintain or even improve performance, but existing methods still distribute adapters broadly, leaving where to place a limited number of adapters to maximize performance largely open. To investigate this, we introduce PAGE (Projected Adapter Gradient Energy), a gradient-based sensitivity probe that estimates the initial trainable gradient energy available to each candidate LoRA adapter. Surprisingly, we find that PAGE is highly concentrated on a single shallow FFN down-projection across two model families and four downstream tasks. We term this module the dominant adaptation module and show that its layer index is architecture-dependent but task-stable. Motivated by this finding, we propose DomLoRA, a placement method that places a single adapter at the dominant adaptation module. With only ~0.7% of vanilla LoRA's trainable parameters, DomLoRA outperforms it on average across various downstream tasks, including instruction following, mathematical reasoning, code generation, and multi-turn conversation. This method also improves other LoRA variants, supporting the dominant adaptation module perspective as a practical placement guideline.
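The abstract does not state the formula behind PAGE. As a rough illustration of what an initial-gradient-energy probe could look like, the sketch below assumes the standard LoRA initialization (B = 0, A random). Under that assumption the gradient with respect to A is zero at step zero, and the only nonzero adapter gradient is dL/dB = G Aᵀ, where G is the gradient of the loss with respect to the frozen weight. Each candidate module is scored by the squared Frobenius norm of that projected gradient. The function names, the scoring rule, the module names, and the random stand-in gradients are all assumptions for illustration, not the authors' definition.

```python
import numpy as np


def projected_gradient_energy(weight_grad, lora_a):
    """Squared Frobenius norm of dL/dB = G @ A.T for an update W + B @ A with B = 0."""
    grad_b = weight_grad @ lora_a.T  # shape (d_out, rank)
    return float(np.sum(grad_b ** 2))


def energy_shares(weight_grads, rank=4, seed=0):
    """Score every candidate module and normalize the scores so they sum to 1."""
    rng = np.random.default_rng(seed)
    scores = {}
    for name, grad in weight_grads.items():
        d_in = grad.shape[1]
        # Random A, as in the usual LoRA init; the scale is an assumption.
        lora_a = rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(rank, d_in))
        scores[name] = projected_gradient_energy(grad, lora_a)
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Stand-in weight gradients (d_out x d_in). In practice these would come from
# one backward pass on a batch of task data; here one module is made much larger.
rng = np.random.default_rng(1)
toy_grads = {
    "layers.0.down_proj": 0.1 * rng.normal(size=(8, 16)),
    "layers.1.down_proj": 5.0 * rng.normal(size=(8, 16)),
    "layers.2.down_proj": 0.1 * rng.normal(size=(8, 16)),
}

shares = energy_shares(toy_grads)
dominant = max(shares, key=shares.get)
print(dominant, round(shares[dominant], 4))
```

In this toy setup the module with the largest weight gradient takes nearly all of the normalized energy, which mirrors the kind of concentration the abstract describes. A single-adapter placement in this spirit would then attach LoRA only to the module returned as `dominant`.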

Problem

Research questions and friction points this paper is trying to address.

LoRA

adapter placement

parameter-efficient fine-tuning

dominant adaptation module

gradient energy

Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRA

adapter placement

gradient energy