MadCLIP: Few-shot Medical Anomaly Detection with CLIP

📅 2025-06-30
🤖 AI Summary
This work addresses few-shot anomaly detection in medical imaging, covering both image-level anomaly classification (AC) and pixel-level anomaly segmentation (AS), without relying on synthetic data or external memory banks. We propose a CLIP-based dual-branch adaptation architecture: learnable adapters in the vision encoder explicitly decouple normal and abnormal feature representations, and learnable text prompts strengthen vision-language alignment. We also apply SigLIP loss, used here for the first time in the medical field, to handle the many-to-one relationship between images and unpaired text prompts. The method outperforms existing approaches on both AC and AS in cross-dataset and same-dataset evaluations across multiple imaging modalities, and ablation studies confirm the contribution of each component. Key contributions include: (i) the first CLIP-based few-shot anomaly detection framework for medical imaging that requires neither synthetic data nor external memory; (ii) the first application of SigLIP loss in medical anomaly detection; and (iii) a dual-branch disentangled architecture that improves few-shot generalization.

📝 Abstract
An innovative few-shot anomaly detection approach is presented, leveraging the pre-trained CLIP model for medical data, and adapting it for both image-level anomaly classification (AC) and pixel-level anomaly segmentation (AS). A dual-branch design is proposed to separately capture normal and abnormal features through learnable adapters in the CLIP vision encoder. To improve semantic alignment, learnable text prompts are employed to link visual features. Furthermore, SigLIP loss is applied to effectively handle the many-to-one relationship between images and unpaired text prompts, showcasing its adaptation in the medical field for the first time. Our approach is validated on multiple modalities, demonstrating superior performance over existing methods for AC and AS, in both same-dataset and cross-dataset evaluations. Unlike prior work, it does not rely on synthetic data or memory banks, and an ablation study confirms the contribution of each component. The code is available at https://github.com/mahshid1998/MadCLIP.
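The abstract says SigLIP loss handles the many-to-one relationship between images and unpaired text prompts. The following is a minimal sketch of a SigLIP-style pairwise sigmoid loss under that reading, not the authors' implementation. The function name `sigmoid_pairwise_loss`, the toy embeddings, and the fixed `temperature` and `bias` values are assumptions made for illustration (in SigLIP these two scalars are learnable). Each image carries the index of the prompt that describes it, so many images can share one prompt.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def sigmoid_pairwise_loss(image_embs, prompt_embs, labels, temperature=10.0, bias=-10.0):
    """SigLIP-style loss over every (image, prompt) pair.

    labels[i] is the index of the prompt that matches image i (hypothetical
    convention: 0 = normal prompt, 1 = abnormal prompt). Each pair is an
    independent binary decision, so no one-to-one pairing is required.
    """
    images = [normalize(v) for v in image_embs]
    prompts = [normalize(v) for v in prompt_embs]
    total = 0.0
    for i, image in enumerate(images):
        for j, prompt in enumerate(prompts):
            z = 1.0 if labels[i] == j else -1.0  # +1 for a match, -1 otherwise
            logit = temperature * dot(image, prompt) + bias
            total += log_sigmoid(z * logit)
    return -total / len(images)


# Toy data: three image embeddings and two prompt embeddings (normal, abnormal).
image_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
prompt_embs = [[1.0, 0.0], [0.0, 1.0]]

loss_matched = sigmoid_pairwise_loss(image_embs, prompt_embs, labels=[0, 1, 0])
loss_swapped = sigmoid_pairwise_loss(image_embs, prompt_embs, labels=[1, 0, 1])
print(loss_matched, loss_swapped)
```

Because each pair is scored with its own sigmoid rather than a softmax over the batch, two images that share the same prompt do not compete with each other, which is what makes this loss a natural fit for a small, fixed set of prompts.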
Problem

Research questions and friction points this paper is trying to address.

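Few-shot anomaly detection in medical images with CLIP, covering both image-level classification (AC) and pixel-level segmentation (AS)
Separating normal and abnormal feature representations without relying on synthetic data or memory banks
Aligning visual features with text semantics when the text prompts are not paired with individual images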
Innovation

Methods, ideas, or system contributions that make the work stand out.

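Adapts the pre-trained CLIP model to few-shot medical anomaly detection for both AC and AS
Uses a dual-branch design with learnable adapters in the CLIP vision encoder to capture normal and abnormal features separately
Employs learnable text prompts for semantic alignment, with SigLIP loss handling the many-to-one relationship between images and unpaired text prompts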
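
One way to picture the dual-branch idea is sketched below. It is an illustration under stated assumptions, not the paper's architecture: the residual bottleneck form of `adapt`, the blend weight `alpha`, the two-way softmax in `anomaly_score`, and the max-pooling used for the image-level score in `score_image` are all hypothetical choices, and the toy weights stand in for parameters that would be learned. Each branch adapts the same frozen feature and compares it with its own text prompt embedding.

```python
import math


def matvec(weights, x):
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def adapt(x, w_down, w_up, alpha=0.5):
    # Hypothetical adapter: down-projection, ReLU, up-projection,
    # then a blend with the frozen input feature.
    hidden = [max(0.0, h) for h in matvec(w_down, x)]
    adapted = matvec(w_up, hidden)
    return [(1 - alpha) * a + alpha * b for a, b in zip(x, adapted)]


def anomaly_score(feature, normal_branch, abnormal_branch,
                  normal_prompt, abnormal_prompt, temperature=10.0):
    # Each branch has its own adapter weights and its own prompt embedding.
    s_normal = temperature * cosine(adapt(feature, *normal_branch), normal_prompt)
    s_abnormal = temperature * cosine(adapt(feature, *abnormal_branch), abnormal_prompt)
    # Two-way softmax, written as a sigmoid of the score difference.
    return 1.0 / (1.0 + math.exp(s_normal - s_abnormal))


def score_image(patch_features, normal_branch, abnormal_branch,
                normal_prompt, abnormal_prompt):
    # Pixel-level map (AS): one score per patch.
    pixel_map = [
        anomaly_score(p, normal_branch, abnormal_branch, normal_prompt, abnormal_prompt)
        for p in patch_features
    ]
    # Image-level score (AC): max over patches, an assumption of this sketch.
    return max(pixel_map), pixel_map


# Toy setup: identity weights stand in for learned adapter parameters.
identity = [[1.0, 0.0], [0.0, 1.0]]
normal_branch = (identity, identity)
abnormal_branch = (identity, identity)
normal_prompt, abnormal_prompt = [1.0, 0.0], [0.0, 1.0]
patches = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9]]

image_score, pixel_map = score_image(
    patches, normal_branch, abnormal_branch, normal_prompt, abnormal_prompt
)
print(image_score, pixel_map)
```

In this toy setup the third patch points toward the abnormal prompt, so it receives a high score and drives the image-level decision, while the first two patches stay close to zero.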