🤖 AI Summary
In multi-task learning (MTL), standard LoRA suffers from task interference because it projects features from all tasks into a single shared low-rank intrinsic space, blurring the distinctions between tasks. To address this, the authors propose MTL-LoRA, which augments LoRA with task-adaptive parameters that separate task-specific information from knowledge shared across tasks within the low-dimensional space, enabling parameter-efficient joint adaptation to multiple target domains. Evaluated across diverse benchmarks—natural language understanding, commonsense reasoning, image-text understanding, and an industrial ad-relevance dataset—MTL-LoRA consistently outperforms LoRA and its variants with comparable or even fewer trainable parameters, while preserving LoRA's parameter efficiency.
📝 Abstract
Parameter-efficient fine-tuning (PEFT) has been widely employed for domain adaptation, with LoRA being one of the most prominent methods due to its simplicity and effectiveness. However, in multi-task learning (MTL) scenarios, LoRA tends to obscure the distinction between tasks by projecting sparse high-dimensional features from different tasks into the same dense low-dimensional intrinsic space. This leads to task interference and suboptimal performance for LoRA and its variants. To tackle this challenge, we propose MTL-LoRA, which retains the advantages of low-rank adaptation while significantly enhancing MTL capabilities. MTL-LoRA augments LoRA by incorporating additional task-adaptive parameters that differentiate task-specific information while capturing shared knowledge across various tasks within low-dimensional spaces. This approach enables pre-trained models to jointly adapt to different target domains with a limited number of trainable parameters. Comprehensive experimental results, including evaluations on public academic benchmarks for natural language understanding, commonsense reasoning, and image-text understanding, as well as a real-world industrial text-ads relevance dataset, demonstrate that MTL-LoRA outperforms LoRA and its variants with comparable or even fewer learnable parameters in the MTL setting.
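To make the core idea concrete, below is a minimal, dependency-free sketch of a task-adaptive low-rank update in the spirit the abstract describes: a shared down-projection and up-projection (as in plain LoRA), plus a small per-task transform in the low-rank space that differentiates task-specific information. The variable names, toy sizes, and diagonal parameterization are our illustrative assumptions, not the paper's exact formulation.

```python
# Toy sketch (not the paper's implementation): plain LoRA applies one shared
# update delta_W = B @ A to every task. A task-adaptive variant keeps A and B
# shared but inserts a per-task diagonal transform Lambda_t in the rank-r space,
# so each task carves out its own subspace of the shared low-rank update.

def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d, r, k = 4, 2, 3  # output dim, LoRA rank, input dim (toy sizes)

# Shared low-rank factors, as in plain LoRA (toy deterministic values).
A = [[0.1 * (i + j) for j in range(k)] for i in range(r)]  # r x k down-projection
B = [[0.1 * (i - j) for j in range(r)] for i in range(d)]  # d x r up-projection

# Hypothetical task-adaptive parameters: one diagonal scale vector per task.
Lambda = {
    "task0": [1.0, 0.0],  # task 0 uses only the first low-rank direction
    "task1": [0.0, 1.0],  # task 1 uses only the second
}

def delta_w(task):
    """Task-adaptive update B @ diag(Lambda_t) @ A for the given task."""
    scaled_A = [[Lambda[task][i] * A[i][j] for j in range(k)] for i in range(r)]
    return matmul(B, scaled_A)

dW0, dW1 = delta_w("task0"), delta_w("task1")
```

The point of the sketch is that the expensive shared factors `A` and `B` are reused by every task, while each task adds only `r` extra parameters (`Lambda_t`), so the two tasks end up with different effective updates (`dW0 != dW1`) at negligible parameter cost.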