Few-Shot, Now for Real: Medical VLMs Adaptation without Balanced Sets or Validation

📅 2025-06-20

🤖 AI Summary

Existing few-shot adaptation paradigms for medical vision-language models (VLMs) rely on balanced support sets and auxiliary validation sets. Both assumptions contradict real-world clinical settings, where disease distributions are inherently imbalanced and labeled data are extremely scarce. This work proposes a few-shot adaptation framework that requires neither a balanced support set nor a validation set. The summary highlights three components: (1) a training-free adaptive linear probe that dynamically fuses visual and textual supervision signals; (2) a cross-modal feature alignment mechanism; and (3) a robust adaptation strategy explicitly designed for class imbalance. Evaluated on medical benchmarks spanning several imaging modalities, the method is reported to outperform state-of-the-art approaches, to avoid falling below zero-shot performance, and to remain accurate and stable in ultra-low-data regimes. It is presented as a more practical, efficient, and clinically trustworthy option for few-shot adaptation in real-world medical AI deployment.

📝 Abstract

Vision-language models (VLMs) are gaining attention in medical image analysis. These models are pre-trained on large, heterogeneous data sources, yielding rich and transferable representations. Notably, the combination of modality-specialized VLMs with few-shot adaptation has provided fruitful results, enabling the efficient deployment of high-performing solutions. However, previous works on this topic make strong assumptions about the distribution of adaptation data, which are unrealistic in the medical domain. First, prior art assumes access to a balanced support set, a condition that contradicts the natural imbalance in disease prevalence found in real-world scenarios. Second, these works typically assume the presence of an additional validation set to fix critical hyper-parameters, which is highly data-inefficient. This work challenges these favorable deployment scenarios and introduces a realistic, imbalanced, validation-free adaptation setting. Our extensive benchmark across various modalities and downstream tasks demonstrates that current methods systematically lose performance when operating under realistic conditions, occasionally even performing worse than zero-shot inference. We also introduce a training-free linear probe that adaptively blends visual and textual supervision. Detailed studies demonstrate that the proposed solver is a strong, efficient baseline, enabling robust adaptation in challenging scenarios.

Problem

Research questions and friction points this paper is trying to address.

Adapts medical VLMs without balanced data sets

Eliminates need for validation sets in adaptation

Addresses performance drop in realistic imbalanced scenarios
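To make the contrast between the two adaptation settings concrete, the following is a minimal sketch of how a balanced and an imbalanced support set could be drawn from the same labeled pool. It is an illustration only: the function names, the uniform-sampling rule, and the toy 90/10 prevalence are assumptions, not the sampling protocol used in the paper.

```python
import random
from collections import Counter


def sample_balanced_support(labels, shots_per_class, seed=0):
    """Conventional setting: exactly `shots_per_class` examples per class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    support = []
    for label in sorted(by_class):
        support.extend(rng.sample(by_class[label], shots_per_class))
    return support


def sample_imbalanced_support(labels, budget, seed=0):
    """Assumed realistic setting: draw `budget` examples uniformly from the
    pool, so class counts follow the pool's natural prevalence."""
    rng = random.Random(seed)
    return rng.sample(range(len(labels)), budget)


if __name__ == "__main__":
    # Hypothetical pool: 90% "healthy" (0), 10% "disease" (1).
    pool_labels = [0] * 90 + [1] * 10
    balanced = sample_balanced_support(pool_labels, shots_per_class=4)
    imbalanced = sample_imbalanced_support(pool_labels, budget=8)
    print("balanced:  ", Counter(pool_labels[i] for i in balanced))
    print("imbalanced:", Counter(pool_labels[i] for i in imbalanced))
```

With the same total budget of eight labeled images, the imbalanced draw will usually contain few or no examples of the rare class, which is the situation an adaptation method has to tolerate when no balanced set is available.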

Innovation

Methods, ideas, or system contributions that make the work stand out.

Few-shot adaptation for medical VLMs

Training-free linear probe method

Robust adaptation without balanced data
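The idea of a training-free probe that blends visual and textual supervision can be sketched as follows. This is not the paper's solver, whose exact formulation is not given on this page. It is a hypothetical closed-form rule in which each class weight interpolates between the text embedding of the class prompt and the mean of that class's support features, with a per-class mixing coefficient that grows with the number of shots. The names `blended_probe` and `shots_scale` and the rule `a_c = n_c / (n_c + shots_scale)` are assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def class_prototypes(features, labels, num_classes):
    """Mean of L2-normalized support features per class.

    Returns (prototypes, counts); a prototype is None when a class has no shots,
    which can happen with an imbalanced support set.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feature, label in zip(features, labels):
        unit = l2_normalize(feature)
        counts[label] += 1
        for j in range(dim):
            sums[label][j] += unit[j]
    prototypes = [
        l2_normalize([s / n for s in row]) if n > 0 else None
        for row, n in zip(sums, counts)
    ]
    return prototypes, counts


def blended_probe(text_embeds, features, labels, shots_scale=4.0):
    """Hypothetical blend: w_c = (1 - a_c) * text_c + a_c * proto_c,
    with a_c = n_c / (n_c + shots_scale). No gradient steps, no validation set."""
    num_classes = len(text_embeds)
    prototypes, counts = class_prototypes(features, labels, num_classes)
    weights = []
    for c in range(num_classes):
        text = l2_normalize(text_embeds[c])
        if prototypes[c] is None:
            # No visual evidence for this class: fall back to the zero-shot weight.
            weights.append(text)
            continue
        a = counts[c] / (counts[c] + shots_scale)
        mixed = [(1 - a) * t + a * p for t, p in zip(text, prototypes[c])]
        weights.append(l2_normalize(mixed))
    return weights


def predict(weights, feature):
    """Return the class whose weight has the highest cosine similarity."""
    unit = l2_normalize(feature)
    scores = [sum(w * f for w, f in zip(weight, unit)) for weight in weights]
    return max(range(len(scores)), key=scores.__getitem__)
```

Under this assumed rule, a class with no support examples keeps its zero-shot text weight, so the probe cannot do worse than zero-shot inference on that class simply because shots are missing. The constant `shots_scale` is fixed in advance rather than tuned, which mirrors the validation-free constraint described above.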
