A data- and compute-efficient chest X-ray foundation model beyond aggressive scaling

📅 2026-02-26
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
This work addresses the challenges of redundancy, class imbalance, and computational inefficiency commonly encountered in medical foundation models due to their reliance on large-scale pretraining data. The authors propose CheXficient, an active learning–driven intelligent data curation strategy that significantly enhances generalization on rare pathologies while utilizing only 22.7% of chest X-ray–report pairs and less than 27.3% of the computational budget during pretraining. Integrated within a vision–language pretraining framework, CheXficient supports zero-shot classification, cross-modal retrieval, and diverse downstream tasks. Evaluated across 20 benchmarks spanning five task categories, CheXficient matches or surpasses models trained on full datasets, demonstrating particularly strong performance in long-tailed and rare disease scenarios.

📝 Abstract
Foundation models for medical imaging are typically pretrained on increasingly large datasets, following a "scale-at-all-costs" paradigm. However, this strategy faces two critical challenges: large-scale medical datasets often contain substantial redundancy and severe class imbalance that bias representation learning toward over-represented patterns, and training indiscriminately, regardless of heterogeneity in data quality, incurs considerable computational inefficiency. Here we demonstrate that active, principled data curation during pretraining can serve as a viable, cost-effective alternative to brute-force dataset enlargement. We introduce CheXficient, a chest X-ray (CXR) foundation model that selectively prioritizes informative training samples. CheXficient is pretrained on only 22.7% of 1,235,004 paired CXR images and reports while consuming under 27.3% of the total compute budget, yet it achieves comparable or superior performance to its full-data counterpart and other large-scale pretrained models. We assess CheXficient across 20 individual benchmarks spanning 5 task types, including non-adapted off-the-shelf evaluations (zero-shot findings classification and cross-modal retrieval) and adapted downstream tasks (disease prediction, semantic segmentation, and radiology report generation). Further analyses show that CheXficient systematically prioritizes under-represented training samples, improving generalizability on long-tailed or rare conditions. Overall, our work offers practical insights into the data and computation demands for efficient pretraining and downstream adaptation of medical vision-language foundation models.
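The abstract describes selecting a small, informative subset of image–report pairs (22.7% of the data) that favors under-represented conditions, but does not spell out the selection rule. A minimal illustrative sketch, assuming informativeness is scored by per-sample loss weighted by class rarity — the function name, the scoring formula, and the budget handling are all hypothetical, not the paper's actual method:

```python
import numpy as np

def select_informative_samples(losses, class_counts, labels, budget=0.227):
    """Illustrative active data-curation step (hypothetical, not CheXficient's rule).

    Scores each sample by its current training loss (a proxy for informativeness)
    scaled by inverse class frequency (a proxy for rarity), then keeps the
    top `budget` fraction of samples for the next pretraining round.
    """
    losses = np.asarray(losses, dtype=float)
    # Inverse-frequency weight: rare classes get larger weights.
    rarity = 1.0 / np.asarray([class_counts[y] for y in labels], dtype=float)
    score = losses * rarity / rarity.mean()  # normalize so weights average to 1
    k = int(len(score) * budget)
    # Indices of the k highest-scoring (most informative / rarest) samples.
    return np.argsort(score)[::-1][:k]
```

Under this toy rule, a high-loss sample from a rare class dominates the ranking, which mirrors the paper's observation that curation systematically favors under-represented samples; the real criterion would be applied iteratively during vision–language pretraining.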
Problem

Research questions and friction points this paper is trying to address.

medical imaging
data redundancy
class imbalance
computational inefficiency
foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

data-efficient pretraining
active data curation
medical foundation model
chest X-ray
compute efficiency
Chong Wang
Stanford University
Trustworthy AI, Deep Learning, Medical Image Analysis
Yabin Zhang
Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA.; Department of Radiology, Stanford University, Stanford, CA, USA.
Yunhe Gao
Stanford University, Rutgers University
Computer Vision, Machine Learning, Medical Imaging Analysis, Vision-Language Model
Maya Varma
Stanford University
Computer Science
Clemence Mottez
Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA.; Department of Radiology, Stanford University, Stanford, CA, USA.
Faidra Patsatzi
Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA.; Department of Radiology, Stanford University, Stanford, CA, USA.
Jiaming Liu
Postdoc@Stanford, PhD@WUSTL
Optimization, Computational Imaging, Deep Learning
Jin Long
Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA.; Department of Pediatrics, Stanford University, Stanford, CA, USA.
Jean-Benoit Delbrouck
Hugging Face, Stanford
Sergios Gatidis
Stanford Medicine
Healthcare AI, Medical Image and Data Analysis, Pediatric Radiology, Hybrid Imaging
Akshay S. Chaudhari
Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA.; Department of Radiology, Stanford University, Stanford, CA, USA.; Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
Curtis P. Langlotz
Professor of Radiology, Medicine, and Biomedical Data Science, Stanford University
machine learning, computer vision, natural language processing, decision support systems, technology assessment