🤖 AI Summary
Existing LoRA methods treat the low-rank updates for individual layers and attention projections (Query, Key, Value) as independent matrices, neglecting structural correlations across layers and projections. To address this, we propose TensLoRA, a systematic framework for tensorized low-rank adaptation. TensLoRA aggregates all LoRA updates into a higher-order tensor and uses tensor decompositions to jointly model inter-layer and inter-projection dependencies, generalizing prior tensor-based extensions of LoRA into a single formulation. The framework also enables mode-specific compression rates, so parameter budgets can be tailored to the modality and task. Experiments on vision and language benchmarks show that the tensor construction directly affects performance, with some variants outperforming standard LoRA at similar parameter counts, indicating that explicit tensor structure can strengthen adaptation capability.
📝 Abstract
Low-Rank Adaptation (LoRA) is widely used to efficiently adapt Transformers by adding trainable low-rank matrices to attention projections. While effective, these matrices are treated as independent for each attention projection (Query, Key, and Value) and each layer. Recent extensions have considered joint, tensor-based adaptations, but only in limited forms and without a systematic framework. We introduce TensLoRA, a unified framework that aggregates LoRA updates into higher-order tensors and models a broad family of tensor-based low-rank adaptations. Our formulation generalizes existing tensor-based methods and enables mode-specific compression rates, allowing parameter budgets to be tailored according to the modality and task. Experiments on vision and language benchmarks reveal that the tensor construction directly impacts performance, with some variants outperforming standard LoRA under similar parameter counts.
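To make the core idea concrete, here is a minimal sketch of aggregating per-layer, per-projection LoRA updates into one 4-D tensor and parameterizing it with a Tucker-style factorization. All names, shapes, and the choice of Tucker structure are illustrative assumptions, not the paper's exact construction; the point is only that shared factors across the layer and projection modes can cut parameters well below independent per-matrix LoRA.

```python
import numpy as np

# Hypothetical sketch (not the paper's exact parameterization): collect all
# LoRA updates into a single 4-D tensor Delta[d_out, d_in, layer, proj] and
# represent it with a Tucker-style core plus one factor matrix per mode.

d, L, P = 64, 12, 3   # hidden dim, number of layers, projections (Q, K, V)
r = 4                 # per-mode rank; mode-specific ranks are also possible

rng = np.random.default_rng(0)
G = rng.standard_normal((r, r, r, r))    # core tensor coupling all modes
U_out = rng.standard_normal((d, r))      # output-dimension factor
U_in = rng.standard_normal((d, r))       # input-dimension factor
U_lay = rng.standard_normal((L, r))      # layer factor (shared across layers)
U_prj = rng.standard_normal((P, r))      # projection factor (shared across Q/K/V)

# Reconstruct the full update tensor of shape (d, d, L, P)
Delta = np.einsum("abcd,ia,jb,kc,ld->ijkl", G, U_out, U_in, U_lay, U_prj)

# Compare parameter counts: joint tensor factorization vs. independent LoRA,
# which uses one (A, B) pair of size d x r per layer and projection.
tucker_params = G.size + U_out.size + U_in.size + U_lay.size + U_prj.size
lora_params = L * P * 2 * d * r
print(Delta.shape, tucker_params, lora_params)
```

With these toy sizes the factorized form needs 828 parameters versus 18,432 for independent LoRA matrices, illustrating how coupling the layer and projection modes yields additional compression.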