AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping

📅 2025-05-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
LoRA's purely linear structure creates a representational bottleneck, and increasing its rank to approach full fine-tuning incurs prohibitive parameter overhead. To address this, we propose AuroRA, the first method to insert a learnable Adaptive Nonlinear Layer (ANL) between LoRA's two linear projections, yielding a low-rank yet highly expressive MLP-like architecture that breaks the linear limitation while preserving parameter efficiency. We theoretically establish that AuroRA achieves lower function approximation error and bounded gradients compared to standard LoRA. Extensive experiments across 22 datasets and six pretrained models, including LLMs and vision models, demonstrate that AuroRA matches or surpasses full fine-tuning using only 6.18%–25% of LoRA's parameters, outperforms state-of-the-art PEFT methods by up to 10.88% on NLP and CV tasks, and exhibits strong robustness to rank configuration.

📝 Abstract
Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method validated across NLP and CV domains. However, LoRA faces an inherent low-rank bottleneck: narrowing its performance gap with full fine-tuning requires increasing the rank of its parameter matrix, resulting in significant parameter overhead. Recent linear LoRA variants have attempted to enhance expressiveness by introducing additional linear mappings; however, their composition remains inherently linear and fails to fundamentally improve LoRA's representational capacity. To address this limitation, we propose AuroRA, which incorporates an Adaptive Nonlinear Layer (ANL) between two linear projectors to capture both fixed and learnable nonlinearities. This combination forms an MLP-like structure with a compressed rank, enabling flexible and precise approximation of diverse target functions while theoretically guaranteeing lower approximation errors and bounded gradients. Extensive experiments on 22 datasets and 6 pretrained models demonstrate that AuroRA (I) matches or surpasses full fine-tuning performance with only 6.18%–25% of LoRA's parameters, (II) outperforms state-of-the-art PEFT methods by up to 10.88% in both NLP and CV tasks, and (III) exhibits robust performance across various rank configurations.
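
To make the architecture concrete, here is a minimal PyTorch sketch of the idea, not the authors' implementation: a frozen pretrained projection augmented with a LoRA-style down-projection, a learnable nonlinearity in between, and a zero-initialized up-projection. The class name `ANLLoRALinear`, the bank of fixed activations, and the softmax-mixed parameterization of the ANL are illustrative assumptions; the paper only specifies that the ANL captures both fixed and learnable nonlinearities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ANLLoRALinear(nn.Module):
    """Sketch of a low-rank nonlinear adapter:
    y = W0 x + (alpha / r) * B(ANL(A x))."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)              # keep pretrained weights frozen
        self.A = nn.Linear(base.in_features, rank, bias=False)   # down-projection
        self.B = nn.Linear(rank, base.out_features, bias=False)  # up-projection
        nn.init.zeros_(self.B.weight)            # adapter update starts at zero
        self.scale = alpha / rank
        # Assumed ANL form: a learnable softmax mix over fixed nonlinearities.
        self.activations = (torch.tanh, F.gelu, torch.sin)
        self.mix = nn.Parameter(torch.zeros(len(self.activations)))

    def anl(self, z: torch.Tensor) -> torch.Tensor:
        # Blend fixed activations with learned weights -> adaptive nonlinearity.
        w = torch.softmax(self.mix, dim=0)
        return sum(wi * f(z) for wi, f in zip(w, self.activations))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.B(self.anl(self.A(x)))

# Usage: wrap a pretrained projection, e.g. a rank-4 adapter on a 768-dim layer.
layer = ANLLoRALinear(nn.Linear(768, 768), rank=4)
out = layer(torch.randn(2, 768))                 # shape (2, 768)
```

Because `B` is zero-initialized, the adapted model starts exactly at the pretrained function, as in standard LoRA; unlike standard LoRA, the update `B(ANL(A x))` is no longer a linear map, which is what lets a compressed rank retain expressiveness.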
Problem

Research questions and friction points this paper is trying to address.

Overcoming LoRA's low-rank bottleneck with nonlinear mapping
Reducing parameter overhead while enhancing model expressiveness
Improving performance in NLP and CV tasks efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Adaptive Nonlinear Layer (ANL)
Combines linear projectors with nonlinear mapping
Achieves high performance with fewer parameters
👥 Authors
Haonan Dong
State Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University
Wenhao Zhu
ByteDance Seed
Large Language Model · Machine Translation
Guojie Song
Tenured Professor (Research), Peking University
Psychological AI · AI Safety & Value Alignment · Agent Cognition & Behavioral Modeling · LLM & GML
Liang Wang
Alibaba Group