AI Summary
This work addresses the limitations of existing radiology foundation models, whose heavy reliance on large-scale data and computational resources makes it hard to meet clinical demands for both accuracy and efficiency. The authors propose GreenRFM, a resource-efficient pretraining framework that departs from the "scale-at-all-costs" paradigm and instead adopts the MUST supervision principle (More distilled, Ubiquitous, Semantic-enforcing, Task-aligning), substantially reducing computational overhead. GreenRFM comes in a lightweight configuration (6GB GPU memory) and a high-performance configuration (24GB GPU memory) and applies across imaging modalities, including CT and MRI. Validated on over 200,000 images from four institutions, it achieves state-of-the-art performance on chest and abdominal CT tasks and transfers to musculoskeletal MRI, with single-GPU training completing in as little as four hours.
Abstract
The development of radiology foundation models (RFMs) is hindered by a reliance on brute-force scaling. Existing approaches often directly translate methods developed for natural images, which prioritize scale over precision and hence lead to brittle and expensive models in clinical practice. To address this, we present a resource-efficient pre-training framework, GreenRFM, that achieves state-of-the-art performance. Our framework ensures robust generalization across diverse patient populations and imaging protocols, reducing computational requirements by orders of magnitude while surpassing complex, parameter-heavy models. These capabilities stem from a principled supervision design that aims to make maximal use of supervisory signals via More distilled, Ubiquitous, Semantic-enforcing, and Task-aligning (MUST) supervision, rather than simply increasing the quantity of training data. We offer two GreenRFM configurations: (i) a performant model that establishes a new state of the art using a single 24GB GPU within 24 hours, and (ii) a lightweight model that matches existing benchmarks with 6GB VRAM in 4 hours. We conduct extensive experiments using over 200,000 images from four institutions, spanning two modalities. GreenRFMs achieve superior performance on chest and abdominal CT datasets, on both public and private benchmarks, surpassing a range of baseline models. In addition, the results on internal musculoskeletal MRI images show that the same supervision principles transfer between different modalities. The performance and efficiency of GreenRFM challenge the "scale is all you need" dogma and democratize the equitable development of state-of-the-art RFMs for clinicians, even on a laptop.