🤖 AI Summary
Multimodal large language models (MLLMs) exhibit significant demographic biases, for example along gender and race, in medical image reasoning, yet existing debiasing methods rely on large-scale annotated data or model fine-tuning, limiting their compatibility with foundation models. To address this, we propose a fair in-context learning framework that requires neither fine-tuning nor annotation. Its core contribution is the Fairness-Aware Demonstration Selection (FADS) mechanism, which applies clustering-based sampling to jointly optimize demographic balance and semantic relevance in multimodal contexts, combining clustering-driven exemplar selection, cross-modal semantic matching, and demographic balancing. Evaluated across multiple medical imaging benchmarks, the method reduces inter-group performance disparities by an average of 38% while preserving diagnostic accuracy. The results demonstrate strong efficacy, generalization across tasks and datasets, and ease of deployment, enabling fairer MLLM inference without architectural or training modifications.
📝 Abstract
Multimodal large language models (MLLMs) have shown strong potential for medical image reasoning, yet fairness across demographic groups remains a major concern. Existing debiasing methods often rely on large labeled datasets or fine-tuning, which are impractical for foundation-scale models. We explore In-Context Learning (ICL) as a lightweight, tuning-free alternative for improving fairness. Through systematic analysis, we find that conventional demonstration selection (DS) strategies fail to ensure fairness because the selected exemplars are demographically imbalanced. To address this, we propose Fairness-Aware Demonstration Selection (FADS), which builds demographically balanced and semantically relevant demonstration sets via clustering-based sampling. Experiments on multiple medical imaging benchmarks show that FADS consistently reduces gender-, race-, and ethnicity-related disparities while maintaining strong accuracy. These results highlight fairness-aware in-context learning as a scalable, data-efficient path toward equitable medical image reasoning.
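To make the idea behind fairness-aware demonstration selection concrete, the following is a minimal sketch of one way to balance demographic groups while keeping exemplars semantically close to the query. It is an illustration under stated assumptions, not the paper's exact algorithm: the clustering stage is simplified to ranking by cosine similarity within each demographic group, and the function name `fads_select` and its inputs (precomputed candidate embeddings and group labels) are hypothetical.

```python
import numpy as np

def fads_select(query_emb, cand_embs, cand_groups, k=4):
    """Pick k demonstration indices that balance demographic groups
    while preferring candidates most similar to the query.

    Illustrative sketch only; the full method also uses
    clustering-driven sampling and cross-modal matching.
    """
    # Cosine similarity of each candidate embedding to the query.
    sims = cand_embs @ query_emb / (
        np.linalg.norm(cand_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9
    )
    # Bucket candidate indices by demographic group label.
    groups = {}
    for i, g in enumerate(cand_groups):
        groups.setdefault(g, []).append(i)
    # Within each group, rank candidates by similarity (descending).
    for g in groups:
        groups[g].sort(key=lambda i: -sims[i])
    # Round-robin across groups so the selected set stays balanced.
    selected, order = [], sorted(groups)
    while len(selected) < k and any(groups[g] for g in order):
        for g in order:
            if groups[g] and len(selected) < k:
                selected.append(groups[g].pop(0))
    return selected
```

The round-robin step is what enforces balance: with two groups and `k=2`, the selection contains the best-matching exemplar from each group rather than the two globally most similar candidates, which might all come from one group.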