From Embeddings to Accuracy: Comparing Foundation Models for Radiographic Classification

📅 2025-05-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses a seven-class tube placement assessment task on chest X-ray images. It evaluates how well image embeddings from six general-purpose and medical-domain foundation models (DenseNet121, BiomedCLIP, Med-Flamingo, MedImageInsight, Rad-DINO, and CXR-Foundation) transfer when paired with lightweight adapters such as SVM and Random Forest. MedImageInsight embeddings with an SVM adapter achieve the highest mean AUC (93.8%), followed by Rad-DINO (91.1%) and CXR-Foundation (89.0%). Most adapters train within one minute and run inference within seconds on CPU. For adapters trained on MedImageInsight embeddings, the performance difference between genders is within 2%, and the standard deviation across age groups does not exceed 3%. The core contribution is empirical evidence that pairing foundation model embeddings, especially those from MedImageInsight, with lightweight adapters yields accurate, computationally efficient, and equitable classification suitable for clinical use.

📝 Abstract

Foundation models, pretrained on extensive datasets, have significantly advanced machine learning by providing robust and transferable embeddings applicable to various domains, including medical imaging diagnostics. This study evaluates the utility of embeddings derived from both general-purpose and medical domain-specific foundation models for training lightweight adapter models in multi-class radiography classification, focusing specifically on tube placement assessment. A dataset comprising 8842 radiographs classified into seven distinct categories was employed to extract embeddings using six foundation models: DenseNet121, BiomedCLIP, Med-Flamingo, MedImageInsight, Rad-DINO, and CXR-Foundation. Adapter models were subsequently trained using classical machine learning algorithms. Among these combinations, MedImageInsight embeddings paired with a support vector machine adapter yielded the highest mean area under the curve (mAUC) at 93.8%, followed closely by Rad-DINO (91.1%) and CXR-Foundation (89.0%). In comparison, BiomedCLIP and DenseNet121 exhibited moderate performance with mAUC scores of 83.0% and 81.8%, respectively, whereas Med-Flamingo delivered the lowest performance at 75.1%. Notably, most adapter models demonstrated computational efficiency, achieving training within one minute and inference within seconds on CPU, underscoring their practicality for clinical applications. Furthermore, fairness analyses on adapters trained on MedImageInsight-derived embeddings indicated minimal disparities, with gender differences in performance within 2% and standard deviations across age groups not exceeding 3%. These findings confirm that foundation model embeddings, especially those from MedImageInsight, facilitate accurate, computationally efficient, and equitable diagnostic classification using lightweight adapters for radiographic image analysis.
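
The abstract reports mean AUC (mAUC) without spelling out how it is averaged. A minimal sketch, assuming a macro average of one-vs-rest AUCs across classes, is shown below in plain Python. The function names and the toy scores are illustrative only and are not taken from the paper; the toy example uses three classes rather than the study's seven.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(y_true: Sequence[int],
             y_score: Sequence[Sequence[float]],
             n_classes: int) -> float:
    """Macro average of one-vs-rest AUCs (assumed definition of mAUC)."""
    aucs: List[float] = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]   # class c vs. the rest
        scores = [row[c] for row in y_score]            # predicted score for class c
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / n_classes


# Toy data (made up): true class per image and per-class scores from an adapter.
y_true = [0, 0, 1, 1, 2, 2]
y_score = [
    [0.80, 0.10, 0.10],
    [0.40, 0.50, 0.10],
    [0.20, 0.70, 0.10],
    [0.45, 0.45, 0.10],
    [0.10, 0.20, 0.70],
    [0.30, 0.30, 0.40],
]
toy_mauc = mean_auc(y_true, y_score, n_classes=3)
print(f"toy mAUC = {toy_mauc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based implementation would be preferable at the scale of several thousand radiographs.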

Problem

Research questions and friction points this paper is trying to address.

Evaluating foundation model embeddings for radiographic classification accuracy

Comparing performance of medical vs general-purpose embeddings in tube placement assessment

Assessing computational efficiency and fairness of lightweight adapter models
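
The fairness question above can be made concrete with two summary statistics: the gap between gender subgroups and the standard deviation across age groups. The sketch below assumes a per-subgroup metric (for example mAUC) has already been computed. The subgroup names, the numbers, the use of the population standard deviation, and the helper names are all assumptions for illustration, not values or methods from the paper; only the 2% and 3% thresholds mirror the figures quoted in the abstract.

```python
from statistics import pstdev
from typing import Dict


def max_gap(metric_by_group: Dict[str, float]) -> float:
    """Largest difference in the metric between any two subgroups."""
    values = list(metric_by_group.values())
    return max(values) - min(values)


def spread(metric_by_group: Dict[str, float]) -> float:
    """Population standard deviation of the metric across subgroups."""
    return pstdev(list(metric_by_group.values()))


# Made-up subgroup metrics, expressed as fractions rather than percentages.
by_gender = {"female": 0.80, "male": 0.81}
by_age = {"younger": 0.79, "middle": 0.80, "older": 0.81}

gender_gap = max_gap(by_gender)
age_std = spread(by_age)

# Thresholds mirror the abstract: gender gap within 2%, age-group std at most 3%.
passes_checks = gender_gap <= 0.02 and age_std <= 0.03
print(f"gender gap = {gender_gap:.3f}, age std = {age_std:.4f}, ok = {passes_checks}")
```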

Innovation

Methods, ideas, or system contributions that make the work stand out.

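Uses embeddings from six general-purpose and medical-domain foundation models as features for radiographic classification

Trains lightweight adapter models with classical machine learning algorithms, most within one minute

Achieves the highest mAUC (93.8%) with MedImageInsight embeddings paired with an SVM adapter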
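
The general pattern behind these points is that the foundation model only produces embeddings, and a small classical classifier is fitted on top of them. The sketch below illustrates that pattern in dependency-free Python. Note that it swaps in a k-nearest-neighbour classifier for the SVM purely to avoid third-party libraries; the `KNNAdapter` class, the two-dimensional toy embeddings, and the labels `class_a` and `class_b` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class KNNAdapter:
    """Lightweight adapter: majority vote among the k most similar training embeddings."""

    def __init__(self, k: int = 3) -> None:
        self.k = k
        self._train: List[Tuple[Sequence[float], str]] = []

    def fit(self, embeddings: Sequence[Sequence[float]], labels: Sequence[str]) -> "KNNAdapter":
        # "Training" only stores the precomputed embeddings and their labels.
        self._train = list(zip(embeddings, labels))
        return self

    def predict(self, embedding: Sequence[float]) -> str:
        ranked = sorted(
            self._train,
            key=lambda item: cosine_similarity(embedding, item[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Toy 2-D "embeddings"; real foundation model embeddings have far more dimensions.
train_embeddings = [
    [1.0, 0.1], [0.9, 0.2], [0.8, 0.0],   # class_a
    [0.1, 1.0], [0.2, 0.9], [0.0, 0.8],   # class_b
]
train_labels = ["class_a"] * 3 + ["class_b"] * 3

adapter = KNNAdapter(k=3).fit(train_embeddings, train_labels)
pred_a = adapter.predict([0.95, 0.05])
pred_b = adapter.predict([0.10, 0.90])
print(pred_a, pred_b)
```

Because the embeddings are computed once and reused, swapping this adapter for another classical model only changes the `fit` and `predict` steps, which is what keeps training and CPU inference cheap.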

🔎 Similar Papers

Leveraging Foundation Models for Content-Based Medical Image Retrieval in Radiology