🤖 AI Summary
Existing parameter-efficient fine-tuning (PEFT) methods struggle to match full fine-tuning on dense prediction tasks because they model inputs agnostically and learn redundant representations across layers. This work proposes AdaRoute, an adapter-style method that routes parameters dynamically from shared expert centers, where each expert is a trainable parameter matrix. During the forward pass, each AdaRoute module selectively aggregates these expert matrices into input-adaptive low-rank weights, yielding customized feature representations. Because modules in multiple layers share the same expert center, the design encourages implicit cross-layer feature interaction and improves feature diversity. Combining ideas from mixture-of-experts (MoE), dynamic routing, and low-rank adaptation, AdaRoute is reported to outperform existing PEFT approaches on semantic segmentation, object detection, instance segmentation, and panoptic segmentation.
📝 Abstract
Adapting pre-trained vision models with parameter-efficient fine-tuning (PEFT) remains challenging because the goal is to match the performance of full fine-tuning with a minimal number of trainable parameters. When applied to complex dense prediction tasks, existing methods exhibit limitations, including input-agnostic modeling and redundant cross-layer representations. To address these limitations, we propose AdaRoute, a new adapter-style method featuring a simple mixture-of-experts (MoE) architecture. Specifically, we introduce shared expert centers, where each expert is a trainable parameter matrix. During a feedforward pass, each AdaRoute module in the network dynamically generates weight matrices tailored to that module via a simple dynamic parameter routing mechanism, which selectively aggregates parameter matrices in the corresponding expert center. The dynamic weight matrices in AdaRoute modules enable input-dependent low-rank adaptation, producing more customized and powerful feature representations. Moreover, since AdaRoute modules across multiple network layers share the same expert center, they improve feature diversity by promoting implicit cross-layer feature interaction. Extensive experiments demonstrate the superiority of AdaRoute on diverse vision tasks, including semantic segmentation, object detection, instance segmentation, and panoptic segmentation. Code will be available at: https://bit.ly/3NZcr0H.
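The routing idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class names `ExpertCenter` and `RoutedAdapter`, the softmax router over mean-pooled tokens, the use of two centers (one for the down-projection, one for the up-projection), the ReLU, and the residual connection are all assumptions made for illustration. The sketch only shows how modules in several layers might aggregate matrices from a shared pool into input-dependent low-rank weights.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class ExpertCenter:
    """Hypothetical pool of parameter matrices shared by several modules."""

    def __init__(self, num_experts, rows, cols, scale=0.02):
        # Each expert is one (rows x cols) parameter matrix.
        self.experts = rng.normal(0.0, scale, size=(num_experts, rows, cols))


class RoutedAdapter:
    """Hypothetical adapter whose low-rank weights are mixed per input."""

    def __init__(self, dim, down_center, up_center):
        num_experts = down_center.experts.shape[0]
        self.down_center = down_center  # shared across layers
        self.up_center = up_center      # shared across layers
        # Per-module router (assumed form): one logit per expert, per center.
        self.router = rng.normal(0.0, 0.02, size=(dim, 2 * num_experts))

    def route(self, x):
        # x has shape (tokens, dim); summarize the input by mean pooling.
        summary = x.mean(axis=0)
        logits = (summary @ self.router).reshape(2, -1)
        return softmax(logits)  # shape (2, num_experts), each row sums to 1

    def __call__(self, x):
        gates = self.route(x)
        # Weighted sum of expert matrices gives input-dependent weights.
        w_down = np.tensordot(gates[0], self.down_center.experts, axes=1)
        w_up = np.tensordot(gates[1], self.up_center.experts, axes=1)
        # Low-rank bottleneck with a residual connection (assumed design).
        return x + np.maximum(x @ w_down, 0.0) @ w_up


# Three layers draw on the same two expert centers.
dim, rank, num_experts = 16, 4, 8
down = ExpertCenter(num_experts, dim, rank)
up = ExpertCenter(num_experts, rank, dim)
layers = [RoutedAdapter(dim, down, up) for _ in range(3)]

features = rng.normal(size=(10, dim))
for layer in layers:
    features = layer(features)
```

In this sketch each layer keeps only a small router of its own, while the expert matrices live in the shared centers, so adding layers adds few trainable parameters. Soft aggregation over all experts is used here for simplicity; a sparse top-k selection would be another reading of "selectively aggregates".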