🤖 AI Summary
To address the lack of input awareness and task-specific modeling capability in existing parameter-efficient fine-tuning (PEFT) adapters, this paper proposes a dynamic input-conditioned Transformer architecture. The core innovation is the input-Conditioned Network (iCoN), which generates instance-specific, channel-wise dynamic convolutional kernels for fine-grained, input-adaptive feature modulation. The method fine-tunes only 1.6%–2.8% of the backbone parameters, yet matches full fine-tuning on depth estimation and semantic segmentation, and outperforms full fine-tuning on image classification and instance segmentation. It consistently surpasses mainstream PEFT approaches, including LoRA and Adapter, across diverse downstream tasks. By enabling input-aware, task-adaptive representation learning with minimal parameter overhead, the proposed method substantially enhances the generalization capability and expressive power of PEFT across heterogeneous vision tasks.
📝 Abstract
Transfer learning based on full fine-tuning (FFT) of a pre-trained encoder and a task-specific decoder becomes increasingly costly as deep models grow exponentially in size. Parameter-efficient fine-tuning (PEFT) approaches using adapters composed of small learnable layers have emerged as an alternative to FFT, achieving comparable performance while maintaining high training efficiency. However, the inflexibility of the adapter with respect to input instances limits its capability to learn task-specific information across diverse downstream tasks. In this paper, we propose a novel PEFT approach, the input-Conditioned transFormer, termed iConFormer, which leverages a dynamic adapter conditioned on the input instances. To secure flexible learning ability over input instances in various downstream tasks, we introduce an input-Conditioned Network (iCoN) in the dynamic adapter that enables instance-level feature transformation. Specifically, iCoN generates channel-wise convolutional kernels for each feature and transforms it through an adaptive convolution process to effectively capture task-specific and fine-grained details tailored to downstream tasks. Experimental results demonstrate that by tuning just 1.6% to 2.8% of the Transformer backbone parameters, iConFormer achieves performance comparable to FFT in monocular depth estimation and semantic segmentation, while outperforming it in image classification and instance segmentation. The proposed method also consistently outperforms recent PEFT methods on all of the tasks above.
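To make the mechanism concrete, below is a minimal PyTorch sketch of the idea described above: an adapter whose kernel-generating branch predicts an instance-specific, channel-wise (depthwise) convolution kernel from the input feature, applies it via a grouped convolution, and then passes the result through a standard bottleneck projection with a residual connection. All names (`iCoNAdapter`, `kernel_gen`, the pooled-context conditioning, and the bottleneck sizes) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class iCoNAdapter(nn.Module):
    """Hypothetical sketch of a dynamic, input-conditioned adapter (iCoN-style).

    A small generator predicts one k x k kernel per channel, per instance,
    from globally pooled context; the kernels are applied as a depthwise
    convolution, followed by a conventional adapter bottleneck.
    """

    def __init__(self, channels: int, bottleneck: int = 16, k: int = 3):
        super().__init__()
        self.k = k
        # Standard adapter bottleneck: down-project, nonlinearity, up-project.
        self.down = nn.Linear(channels, bottleneck)
        self.up = nn.Linear(bottleneck, channels)
        # Kernel generator: pooled feature -> one k*k kernel per channel.
        self.kernel_gen = nn.Linear(channels, channels * k * k)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        B, C, H, W = x.shape
        # Predict instance-specific depthwise kernels from pooled context.
        ctx = x.mean(dim=(2, 3))                               # (B, C)
        kernels = self.kernel_gen(ctx).view(B * C, 1, self.k, self.k)
        # Per-instance channel-wise convolution via the grouped-conv trick:
        # fold the batch into the channel dimension and set groups = B * C.
        y = F.conv2d(x.reshape(1, B * C, H, W), kernels,
                     padding=self.k // 2, groups=B * C)
        y = y.view(B, C, H, W)
        # Bottleneck projection applied token-wise over the channel dim.
        z = y.permute(0, 2, 3, 1)                              # (B, H, W, C)
        z = self.up(F.gelu(self.down(z))).permute(0, 3, 1, 2)
        return x + z                                           # residual output

adapter = iCoNAdapter(channels=64)
out = adapter(torch.randn(2, 64, 14, 14))
print(out.shape)  # torch.Size([2, 64, 14, 14])
```

Because the kernels are a function of each input instance, every sample is filtered differently, which is the flexibility that a static adapter lacks; only the generator and bottleneck weights are trained, keeping the parameter overhead small.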