🤖 AI Summary
This work addresses two limitations of existing low-rank adaptation methods. LoRA relies on a fixed, pre-specified rank, which makes it hard to balance efficiency against performance. Dynamic-rank approaches such as DyLoRA suffer from inconsistent gradient signals across ranks, which leads to poor high-rank performance and low data efficiency. Inspired by matryoshka dolls, the authors propose MatryoshkaLoRA, a hierarchical low-rank adaptation framework that inserts a fixed, carefully designed diagonal scaling matrix \( P \) between the LoRA adapters. Changing \( P \) recovers LoRA and DyLoRA as special cases, and the framework is designed to keep gradient flow consistent across all sub-ranks. The method supports dynamic rank selection with minimal loss in accuracy. To evaluate hierarchical adapters, the authors introduce a new metric, Area Under the Rank Accuracy Curve (AURAC). Experiments show that the approach learns more accurate hierarchical representations than prior rank-adaptive methods on the evaluated datasets, maintaining accuracy across ranks and achieving a better trade-off between accuracy and efficiency.
📝 Abstract
As deep learning models scale to billions of parameters, the computational cost of fine-tuning remains a significant barrier to deployment. While Low-Rank Adaptation (LoRA) has become the standard for parameter-efficient fine-tuning, the need to set a predefined, static rank $r$ requires exhaustive grid searches to balance efficiency and performance. Existing rank-adaptive solutions such as DyLoRA mitigate this by sampling ranks during training from a predefined distribution. However, they often yield sub-optimal results at higher ranks due to a lack of consistent gradient signals across the full hierarchy of ranks, which makes these methods data-inefficient. In this paper, we propose MatryoshkaLoRA, a general, Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations by inserting a fixed, carefully crafted diagonal matrix $P$ between the existing LoRA adapters to scale their sub-ranks accordingly. With this simple modification, our framework recovers LoRA and DyLoRA solely by changing $P$ and ensures that all sub-ranks embed the available gradient information efficiently. MatryoshkaLoRA supports dynamic rank selection with minimal degradation in accuracy. We further propose Area Under the Rank Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters. Our results demonstrate that MatryoshkaLoRA learns more accurate hierarchical low-rank representations than prior rank-adaptive approaches and achieves superior accuracy-performance trade-offs across ranks on the evaluated datasets. Our code is available at https://github.com/IST-DASLab/MatryoshkaLoRA.
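To make the role of the diagonal matrix $P$ concrete, the following is a minimal NumPy sketch of a weight update of the form $B P A$, not the authors' implementation. The function names (`adapter_delta`, `truncation_mask`, `aurac`) are hypothetical, and the abstract does not specify the actual entries of $P$. The sketch shows two things that follow directly from the algebra: an identity $P$ gives the plain LoRA product $BA$, and a 0/1 diagonal that keeps the first $b$ entries gives the rank-$b$ truncated product. Treating the latter as a DyLoRA-style setting is an assumption made here for illustration. The AURAC helper is likewise an assumed reading of the metric: a trapezoidal area under accuracy versus rank, normalized by the rank span.

```python
import numpy as np


def adapter_delta(B, A, p):
    """Return B @ diag(p) @ A for adapters B (d_out x r) and A (r x d_in).

    `p` holds the diagonal of P; scaling the columns of B by p is
    equivalent to multiplying by diag(p) on the right.
    """
    return (B * p) @ A


def truncation_mask(r, b):
    """Diagonal of a P that keeps only the first b of r sub-ranks.

    Assumption: used here as an illustrative DyLoRA-style truncation.
    """
    p = np.zeros(r)
    p[:b] = 1.0
    return p


def aurac(ranks, accuracies):
    """Trapezoidal area under the rank-accuracy curve.

    Assumption: normalized by the rank span so the result stays on the
    accuracy scale; the paper's exact definition may differ.
    """
    ranks = np.asarray(ranks, dtype=float)
    acc = np.asarray(accuracies, dtype=float)
    area = np.sum((ranks[1:] - ranks[:-1]) * (acc[1:] + acc[:-1]) / 2.0)
    return area / (ranks[-1] - ranks[0])


# Small random adapters, for demonstration only.
rng = np.random.default_rng(0)
d_out, d_in, r = 6, 5, 4
B = rng.normal(size=(d_out, r))
A = rng.normal(size=(r, d_in))

full_update = adapter_delta(B, A, np.ones(r))             # identity P
rank2_update = adapter_delta(B, A, truncation_mask(r, 2))  # first 2 sub-ranks
```

Because $P$ is diagonal, selecting a smaller rank at inference time under these assumptions amounts to slicing the first $b$ columns of $B$ and rows of $A$, with no retraining step implied by the sketch itself.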