Leveraging Foundation Models for Content-Based Medical Image Retrieval in Radiology

📅 2024-03-11
🏛️ arXiv.org
📈 Citations: 8
Influential: 1
🤖 AI Summary
Existing content-based image retrieval (CBIR) systems in radiology are typically specialized to certain pathologies, which limits how well they generalize. To address this, we propose using vision foundation models (such as ViT, CLIP, and DINOv2) as off-the-shelf feature extractors for medical image retrieval, without fine-tuning. We benchmark these models on a dataset of 1.6 million 2D radiological images spanning four imaging modalities (X-ray, CT, MRI, and ultrasound) and 161 pathologies. Weakly supervised models perform best, reaching a precision at 1 (P@1) of up to 0.594, which is competitive with a specialized model. Our analysis also indicates that retrieving pathological features is more difficult than retrieving anatomical structures. Overall, the results suggest that foundation models can support versatile, general-purpose medical image retrieval systems that need no task-specific tuning.

📝 Abstract
Content-based image retrieval (CBIR) has the potential to significantly improve diagnostic aid and medical research in radiology. Current CBIR systems face limitations due to their specialization to certain pathologies, limiting their utility. In response, we propose using vision foundation models as powerful and versatile off-the-shelf feature extractors for content-based medical image retrieval. By benchmarking these models on a comprehensive dataset of 1.6 million 2D radiological images spanning four modalities and 161 pathologies, we identify weakly-supervised models as superior, achieving a P@1 of up to 0.594. This performance not only competes with a specialized model but does so without the need for fine-tuning. Our analysis further explores the challenges in retrieving pathological versus anatomical structures, indicating that accurate retrieval of pathological features presents greater difficulty. Despite these challenges, our research underscores the vast potential of foundation models for CBIR in radiology, proposing a shift towards versatile, general-purpose medical image retrieval systems that do not require specific tuning.
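The retrieval metric quoted above, P@1, can be illustrated with a small sketch. It assumes each image has already been mapped to an embedding vector by a frozen feature extractor, ranks candidates by cosine similarity, and uses a leave-one-out protocol in which every image serves once as the query. The leave-one-out setup, the toy vectors, and the label names are assumptions made for illustration only; the abstract does not specify the evaluation protocol, and this is not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def precision_at_1(embeddings, labels):
    """Leave-one-out P@1: share of queries whose nearest *other* item has the same label."""
    if len(embeddings) != len(labels):
        raise ValueError("embeddings and labels must have the same length")
    if len(embeddings) < 2:
        raise ValueError("need at least two items to retrieve from")

    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_sim = None, -math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:  # never retrieve the query itself
                continue
            sim = cosine_similarity(query, candidate)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if labels[best_j] == labels[i]:
            hits += 1
    return hits / len(embeddings)


# Hypothetical 2-D "embeddings"; real ones would come from a frozen foundation model.
toy_embeddings = [
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.9],
    [0.8, 0.6],  # lies closer to the first cluster despite its label
]
toy_labels = ["class_a", "class_a", "class_b", "class_b", "class_b"]

print(precision_at_1(toy_embeddings, toy_labels))  # 4 of 5 queries hit -> 0.8
```

In this toy set the last item is retrieved against the wrong class, so four of five queries succeed. A P@1 of 0.594 would correspondingly mean that roughly six in ten queries return a top match with the same label.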
Problem

Research questions and friction points this paper is trying to address.

Enhancing radiology diagnostics with foundation models for CBIR
Overcoming limitations of specialized CBIR systems with versatile models
Evaluating foundation models' performance on diverse radiological images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using vision foundation models for image retrieval
Benchmarking models on 1.6M radiological images
Identifying BiomedCLIP as a highly effective model
Stefan Denner
German Cancer Research Center
Deep Learning, Computer Vision, Machine Learning, Medical Imaging
David Zimmerer
German Cancer Research Center (DKFZ)
Dimitrios Bounias
Division of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany; Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany
Markus Bujotzek
PhD Student, Department of Medical Image Computing, German Cancer Research Center Heidelberg, Germany
Medical Image Computing, Federated Learning, Semantic Segmentation
Shuhan Xiao
Division of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
Lisa Kausch
Division of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany
Philipp Schader
Division of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
Tobias Penzkofer
Department of Radiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
Paul F. Jäger
Interactive Machine Learning Group (IML), German Cancer Research Center (DKFZ), Heidelberg, Germany
Klaus Maier-Hein
Division of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany; Helmholtz Imaging, German Cancer Research Center (DKFZ), Heidelberg, Germany. Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Germany