A$_3$B$_2$: Adaptive Asymmetric Adapter for Alleviating Branch Bias in Vision-Language Image Classification with Few-Shot Learning

📅 2026-05-13
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the Branch Bias problem in few-shot image classification with vision-language models: existing adaptation methods implicitly treat the image and text branches as equally important, yet fine-tuning the image encoder does not always help under distribution shift. The study systematically analyzes this issue and proposes A₃B₂, an Adaptive Asymmetric Adapter built around Uncertainty-Aware Adapter Dampening (UAAD), which automatically suppresses image-branch adaptation when prediction uncertainty is high. Combined with a lightweight asymmetric design inspired by mixture-of-experts and a Load Balancing Regularization term, this gives soft, data-driven control over how much each branch is adapted, without manual intervention. Across 11 datasets and three few-shot image classification tasks, A₃B₂ consistently outperforms 11 competitive prompt- and adapter-based baselines.
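The summary does not say how the load-balancing term is defined. As a rough illustration only, the sketch below uses the widely used Switch-Transformer-style auxiliary loss (number of experts times the sum over experts of routed fraction times mean router probability); this is a swapped-in, assumed formulation, not necessarily the one used in A₃B₂. The function name `load_balancing_loss` and the input layout are hypothetical.

```python
def load_balancing_loss(router_probs):
    """Switch-Transformer-style auxiliary loss (assumed form, not from the paper).

    router_probs: list of per-sample probability lists, one entry per expert.
    The value is 1.0 for perfectly balanced routing and grows as routing
    collapses onto fewer experts.
    """
    num_samples = len(router_probs)
    num_experts = len(router_probs[0])

    # Fraction of samples whose top-1 expert is each expert.
    counts = [0] * num_experts
    for probs in router_probs:
        counts[probs.index(max(probs))] += 1
    routed_fraction = [c / num_samples for c in counts]

    # Mean router probability assigned to each expert.
    mean_prob = [
        sum(probs[e] for probs in router_probs) / num_samples
        for e in range(num_experts)
    ]

    return num_experts * sum(f * p for f, p in zip(routed_fraction, mean_prob))


balanced = load_balancing_loss([[0.6, 0.4], [0.4, 0.6]])
collapsed = load_balancing_loss([[0.9, 0.1], [0.9, 0.1]])
```

Under this assumed form, the collapsed routing scores higher than the balanced one, so adding the term to the training loss would push the router toward using all experts.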
📝 Abstract
Efficient transfer learning methods for large-scale vision-language models (e.g., CLIP) enable strong few-shot transfer, yet existing adaptation methods follow a fixed fine-tuning paradigm that implicitly assumes a uniform importance of the image and text branches, which has not been systematically studied in image classification. Through extensive analysis, we reveal a Branch Bias issue in vision-language image classification: adapting the image encoder does not always improve performance under out-of-distribution settings. Motivated by this observation, we propose A$_3$B$_2$, an Adaptive Asymmetric Adapter that alleviates Branch Bias in few-shot learning. A$_3$B$_2$ introduces Uncertainty-Aware Adapter Dampening (UAAD), which automatically suppresses image-branch adaptation when prediction uncertainty is high, enabling soft and data-driven control without manual intervention. Architecturally, A$_3$B$_2$ adopts a lightweight asymmetric design inspired by mixture-of-experts with Load Balancing Regularization. Extensive experiments on three few-shot image classification tasks across 11 datasets demonstrate that A$_3$B$_2$ consistently outperforms 11 competitive prompt- and adapter-based baselines.
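The abstract describes UAAD only at a high level, so the following is a minimal sketch of one way such dampening could work, not the authors' implementation. It assumes uncertainty is measured as the normalized entropy of the class prediction and that the image-adapter update is scaled by one minus that value; the gate formula and the names `normalized_entropy` and `dampened_image_feature` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(num_classes), so the result lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def dampened_image_feature(base_feat, adapter_delta, logits):
    """Add the adapter update to the frozen feature, scaled by a confidence gate.

    Assumed gate: 1 - normalized entropy of the prediction, so a highly
    uncertain prediction leaves the frozen image feature almost unchanged.
    """
    gate = 1.0 - normalized_entropy(softmax(logits))
    feat = [b + gate * d for b, d in zip(base_feat, adapter_delta)]
    return feat, gate


base = [1.0, 2.0]
delta = [0.5, -0.5]
_, gate_confident = dampened_image_feature(base, delta, [20.0, 0.0, 0.0])
feat_uncertain, gate_uncertain = dampened_image_feature(base, delta, [0.0, 0.0, 0.0])
```

With uniform logits the gate is approximately zero and the image feature stays at its frozen value, while a sharply peaked prediction lets nearly the full adapter update through. This is the soft, data-driven behavior the abstract describes, under the stated assumptions.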
Problem

Research questions and friction points this paper is trying to address.

Branch Bias
Vision-Language Models
Few-Shot Learning
Image Classification
Transfer Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Branch Bias
Adaptive Asymmetric Adapter
Uncertainty-Aware Adapter Dampening
Few-Shot Learning
Vision-Language Models