Delving into Out-of-Distribution Detection with Medical Vision-Language Models

📅 2025-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the underexplored yet critical challenge of out-of-distribution (OOD) detection for medical vision-language models (VLMs) operating on highly variable and noisy clinical imaging data. We present the first systematic study of OOD detection in medical VLMs. Our method introduces a hierarchical prompting mechanism to enhance cross-modal semantic discrimination and establishes a comprehensive, multi-faceted OOD evaluation framework covering both semantic and covariate shifts. Integrating zero-shot inference, prompt engineering, and OOD confidence calibration, our approach enables distributional distance modeling and semantic alignment across diverse general-purpose and domain-specific medical VLMs. Extensive experiments on multiple medical imaging benchmarks demonstrate significant improvements over existing VLM-based OOD detection methods. The code is publicly released, establishing a new paradigm for trustworthy AI in healthcare.

📝 Abstract
Recent advances in medical vision-language models (VLMs) demonstrate impressive performance in image classification tasks, driven by their strong zero-shot generalization capabilities. However, given the high variability and complexity inherent in medical imaging data, the ability of these models to detect out-of-distribution (OOD) data in this domain remains underexplored. In this work, we conduct the first systematic investigation into the OOD detection potential of medical VLMs. We evaluate state-of-the-art VLM-based OOD detection methods across a diverse set of medical VLMs, including both general-purpose and domain-specific models. To accurately reflect real-world challenges, we introduce a cross-modality evaluation pipeline for benchmarking full-spectrum OOD detection, rigorously assessing model robustness against both semantic shifts and covariate shifts. Furthermore, we propose a novel hierarchical prompt-based method that significantly enhances OOD detection performance. Extensive experiments are conducted to validate the effectiveness of our approach. The code is available at https://github.com/PyJulie/Medical-VLMs-OOD-Detection.
Problem

Research questions and friction points this paper is trying to address.

Exploring out-of-distribution detection in medical vision-language models.
Assessing robustness against semantic and covariate shifts in medical imaging.
Proposing a hierarchical prompt-based method to improve OOD detection.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-modality evaluation pipeline for OOD detection
Hierarchical prompt-based method enhances detection
Systematic investigation of medical VLMs' OOD potential
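The page does not spell out how prompt-based OOD scoring works, so here is a minimal sketch in the spirit of common VLM-based detectors (maximum-softmax over image-prompt similarities), extended with a two-level prompt hierarchy. All names (`msp_score`, `hierarchical_score`, the `alpha` blend, the coarse/fine prompt split) and the synthetic embeddings are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def softmax(sims, temperature=0.01):
    """Temperature-scaled softmax over image-prompt cosine similarities."""
    z = sims / temperature
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def msp_score(image_emb, prompt_embs, temperature=0.01):
    """Max softmax probability over prompts: high = likely in-distribution."""
    sims = prompt_embs @ image_emb       # cosine similarities (unit vectors assumed)
    return softmax(sims, temperature).max()

def hierarchical_score(image_emb, coarse_embs, fine_embs, alpha=0.5):
    """Hypothetical blend of coarse (e.g. modality-level) and
    fine (e.g. disease-level) prompt scores."""
    return (alpha * msp_score(image_emb, coarse_embs)
            + (1 - alpha) * msp_score(image_emb, fine_embs))

# Synthetic demo: 3-d unit vectors stand in for real CLIP-style embeddings.
fine = np.eye(3)                          # three fine-grained class prompts
coarse = np.eye(3)[:2]                    # two coarse-level prompts
id_img = np.array([1.0, 0.0, 0.0])        # aligns with one known prompt
ood_img = np.ones(3) / np.sqrt(3)         # equally far from all prompts

s_id = hierarchical_score(id_img, coarse, fine)
s_ood = hierarchical_score(ood_img, coarse, fine)
print(f"ID score {s_id:.3f} > OOD score {s_ood:.3f}")
```

In practice the score would be thresholded (e.g. via a validation-set FPR target) to flag low-confidence inputs as OOD.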
Lie Ju
University College London; Moorfields Eye Hospital; Monash University
Computer Vision, Medical Image Analysis, Ophthalmology
Sijin Zhou
Monash University
Computer Vision, Multimodal Learning, Federated Learning, Medical Image Processing
Yukun Zhou
Moorfields Eye Hospital, United Kingdom; University College London, United Kingdom
Huimin Lu
National University of Defense Technology
Robot Vision, Multi-robot Coordination, Robot Soccer, Robot Rescue
Zhuoting Zhu
Melbourne University, Australia
Pearse A. Keane
Moorfields Eye Hospital, United Kingdom; University College London, United Kingdom
Zongyuan Ge
Monash University, Australia; Airdoc Technology Inc, China