🤖 AI Summary
In RGB-T tracking, low-rank adaptation suffers from redundancy in the rank space: many ranks contribute almost no practical information, which limits the diversity of what the model can learn and its multimodal representation capacity. To address this, we propose Group Orthogonal Low-Rank Adaptation (GOLA). GOLA first quantifies the importance of individual ranks via singular value decomposition, freezes the most important ranks to preserve pretrained priors, and clusters the redundant ranks into groups. It then enforces an inter-group orthogonality constraint so that the rank groups learn complementary features. By structuring parameter updates in this way, GOLA improves multimodal feature representation and task adaptability while fine-tuning only a small set of parameters. Evaluated on four RGB-T tracking benchmarks, GOLA outperforms state-of-the-art methods, supporting its effectiveness in reducing rank redundancy and enhancing feature representation.
📝 Abstract
Parameter-efficient fine-tuning has emerged as a promising paradigm in RGB-T tracking, enabling downstream task adaptation by freezing pretrained parameters and fine-tuning only a small set of parameters. This set forms a rank space made up of multiple individual ranks, whose expressiveness directly shapes the model's adaptability. However, quantitative analysis reveals that low-rank adaptation exhibits significant redundancy in the rank space, with many ranks contributing almost no practical information. This hinders the model's ability to learn the diverse knowledge needed to address the various challenges in RGB-T tracking. To address this issue, we propose the Group Orthogonal Low-Rank Adaptation (GOLA) framework for RGB-T tracking, which makes effective use of the rank space through structured parameter learning. Specifically, we adopt a rank decomposition partitioning strategy that uses singular value decomposition to quantify rank importance, freezes the crucial ranks to preserve the pretrained priors, and clusters the redundant ranks into groups in preparation for the subsequent orthogonal constraints. We further design an inter-group orthogonal constraint strategy, which enforces orthogonality between rank groups and compels them to learn complementary features that target diverse challenges, thereby alleviating information redundancy. Experimental results demonstrate that GOLA effectively reduces parameter redundancy and enhances feature representation capabilities, significantly outperforming state-of-the-art methods across four benchmark datasets and validating its effectiveness in RGB-T tracking tasks.
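The two steps described in the abstract (SVD-based rank partitioning, then an inter-group orthogonality constraint) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a LoRA-style update `delta_W = B @ A`, treats the singular values of `delta_W` as the rank-importance score, replaces the paper's clustering step with a simple contiguous split of the less important ranks, and uses the summed squared Frobenius norm of cross-group inner products as the orthogonality penalty. The function names `svd_rank_partition` and `inter_group_orthogonality_penalty` are hypothetical.

```python
import numpy as np


def svd_rank_partition(B, A, num_frozen, num_groups):
    """Re-express delta_W = B @ A along singular directions and partition ranks.

    B has shape (d_out, r) and A has shape (r, d_in).
    Assumption: rank importance is the singular value of delta_W, and the
    redundant ranks are grouped by a contiguous split (a stand-in for the
    clustering step, whose details are not given in the abstract).
    """
    r = A.shape[0]
    U, S, Vt = np.linalg.svd(B @ A, full_matrices=False)
    # delta_W has rank at most r, so the first r components reproduce it.
    U, S, Vt = U[:, :r], S[:r], Vt[:r, :]
    B_new = U * np.sqrt(S)             # scale each column by sqrt(sigma_i)
    A_new = np.sqrt(S)[:, None] * Vt   # scale each row by sqrt(sigma_i)
    # Singular values come sorted in descending order: the leading ranks are
    # the most important and are marked as frozen.
    frozen = np.arange(num_frozen)
    groups = np.array_split(np.arange(num_frozen, r), num_groups)
    return B_new, A_new, S, frozen, groups


def inter_group_orthogonality_penalty(A, groups):
    """Sum of squared inner products between rows of different rank groups.

    Assumption: this Frobenius-norm form is one common way to write an
    orthogonality penalty; the paper's exact loss may differ.
    """
    penalty = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            cross = A[groups[i]] @ A[groups[j]].T  # (|g_i|, |g_j|) inner products
            penalty += np.sum(cross ** 2)
    return penalty
```

In a training loop, a penalty of this kind would typically be added to the tracking loss with a weighting coefficient, and gradients would be applied only to the non-frozen ranks. Immediately after the SVD reparameterization the rows of `A_new` are mutually orthogonal, so the penalty starts near zero and only becomes active as the groups drift toward each other during fine-tuning.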