A Versatile Foundation Model for AI-enabled Mammogram Interpretation

📅 2025-09-24
🤖 AI Summary
Current mammography foundation models suffer from insufficient data diversity, poor generalizability, and inadequate clinical evaluation. To address these limitations, the authors propose a two-stage pretraining paradigm that combines self-supervised learning with clinical knowledge distillation, yielding VersaMammo, a general-purpose foundation model designed specifically for mammographic imaging. Trained on a large-scale, multicenter dataset, the model supports five categories of downstream tasks (lesion detection, segmentation, classification, image retrieval, and visual question answering) within a unified framework. The authors also introduce a comprehensive benchmark of 92 clinically relevant tasks. In experiments, the model ranks first on 50 of 68 internal validation tasks (mean rank = 1.5) and on 20 of 24 external validation tasks (mean rank = 1.2), outperforming existing methods. The work is a step toward automated, clinically translatable breast cancer screening.

📝 Abstract
Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related mortality in women globally. Mammography is essential for the early detection and diagnosis of breast lesions. Despite recent progress in foundation models (FMs) for mammogram analysis, their clinical translation remains constrained by several fundamental limitations, including insufficient diversity in training data, limited model generalizability, and a lack of comprehensive evaluation across clinically relevant tasks. Here, we introduce VersaMammo, a versatile foundation model for mammograms, designed to overcome these limitations. We curated the largest multi-institutional mammogram dataset to date, comprising 706,239 images from 21 sources. To improve generalization, we propose a two-stage pre-training strategy. First, a teacher model is trained via self-supervised learning to extract transferable features from unlabeled mammograms. Then, supervised learning combined with knowledge distillation transfers both features and clinical knowledge into VersaMammo. To ensure a comprehensive evaluation, we established a benchmark of 92 specific tasks (68 internal validation tasks and 24 external validation tasks) spanning 5 major clinical task categories: lesion detection, segmentation, classification, image retrieval, and visual question answering. VersaMammo achieves state-of-the-art performance, ranking first in 50 of the 68 internal tasks and 20 of the 24 external validation tasks, with average ranks of 1.5 and 1.2, respectively. These results demonstrate its superior generalization and clinical utility, offering a substantial advancement toward reliable and scalable breast cancer screening and diagnosis.
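The headline numbers in the abstract (first-place counts and average ranks) are aggregates over per-task rankings. The following is a minimal sketch of how such a summary could be computed, not the authors' evaluation code. The model names, task names, and scores are hypothetical placeholders, and the tie rule (tied models share the better rank) is an assumption.

```python
from statistics import mean

def rank_models(scores):
    """Rank models on one task: rank 1 is the highest score.

    Assumption: tied models share the better rank.
    """
    ordered = sorted(scores.values(), reverse=True)
    return {model: ordered.index(score) + 1 for model, score in scores.items()}

def summarize(task_scores, model):
    """Count first places and compute the mean rank of `model` across tasks."""
    ranks = [rank_models(scores)[model] for scores in task_scores.values()]
    return {
        "first_places": sum(r == 1 for r in ranks),
        "num_tasks": len(ranks),
        "mean_rank": mean(ranks),
    }

# Hypothetical scores (higher is better); none of these values come from the paper.
task_scores = {
    "task_a": {"model_x": 0.91, "model_y": 0.88, "model_z": 0.85},
    "task_b": {"model_x": 0.80, "model_y": 0.83, "model_z": 0.79},
    "task_c": {"model_x": 0.95, "model_y": 0.90, "model_z": 0.93},
    "task_d": {"model_x": 0.70, "model_y": 0.60, "model_z": 0.65},
}

summary = summarize(task_scores, "model_x")
print(summary)
```

In this toy example, model_x is first on three of four tasks and second on the remaining one, so its mean rank is (1 + 2 + 1 + 1) / 4 = 1.25.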
Problem

Research questions and friction points this paper is trying to address.

Developing a versatile foundation model for mammogram interpretation to overcome limitations in existing AI systems
Addressing insufficient training data diversity and limited model generalizability in mammogram analysis
Establishing comprehensive evaluation across multiple clinically relevant breast cancer detection tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage pre-training with self-supervised and supervised learning
Knowledge distillation to transfer features and clinical knowledge
Comprehensive benchmark with 92 tasks across five clinical categories
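To make the second pre-training stage more concrete, here is a minimal sketch of one common way to combine supervised learning with knowledge distillation: a cross-entropy term on the label plus a feature-matching term against a frozen teacher. This summary does not state the paper's actual loss, so the mean-squared-error feature term, the `distill_weight` parameter, and every function name below are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Supervised term: negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])

def feature_mse(student_feat, teacher_feat):
    """Distillation term (assumed form): mean squared error between features."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(student_feat)

def stage_two_loss(student_logits, label, student_feat, teacher_feat, distill_weight=1.0):
    """Assumed stage-two objective: supervised loss + weighted feature distillation."""
    supervised = cross_entropy(student_logits, label)
    distill = feature_mse(student_feat, teacher_feat)
    return supervised + distill_weight * distill

# Hypothetical values for a two-class task; the teacher features would come
# from the self-supervised stage-one model, which stays frozen here.
loss = stage_two_loss(
    student_logits=[0.0, 0.0],
    label=0,
    student_feat=[1.0, 2.0],
    teacher_feat=[0.0, 0.0],
    distill_weight=0.5,
)
print(loss)
```

With this form, a student whose features already match the teacher pays only the supervised term, and `distill_weight` controls how strongly the teacher's representation constrains the student.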
Fuxiang Huang
The Hong Kong University of Science and Technology (HKUST)
Multimodal Learning, Foundation Model for Vertical Domain, Domain Adaptation
Jiayi Zhu
Ph.D. student, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University
Cognitive Neuroscience, Neuroimaging, Deep Learning
Yunfang Yu
Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Breast Tumor Centre, Phase I Clinical Trial Centre, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
Yu Xie
Department of Radiology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Peking University Cancer Hospital Yunnan, Kunming, Yunnan, China.
Yuan Guo
Department of Radiology, Guangzhou First People’s Hospital, South China University of Technology, Guangzhou, Guangdong, China.
Qingcong Kong
Department of Radiology, The Third Affiliated Hospital, Sun Yat-Sen University, Guangzhou, Guangdong, China.
Mingxiang Wu
Department of Radiology, Shenzhen People’s Hospital, Shenzhen, Guangdong, China.
Xinrui Jiang
The Hong Kong University of Science and Technology (HKUST)
Computer Vision, Medical Image Analysis
Shu Yang
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Jiabo Ma
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Ziyi Liu
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Zhe Xu
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Zhixuan Chen
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Yujie Tan
Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Breast Tumor Centre, Phase I Clinical Trial Centre, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
Zifan He
University of California - Los Angeles
FPGA, HPC, Machine Learning
Luhui Mao
Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Breast Tumor Centre, Phase I Clinical Trial Centre, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
Xi Wang
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Junlin Hou
HKUST | Fudan University
Computer Vision, Medical Image Analysis, Label-efficient Deep Learning, eXplainable AI
Lei Zhang
Data Science and Analytics Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China.
Qiong Luo
HKUST
Database Systems, Parallel and Distributed Systems, Data Management for e-Sciences
Zhenhui Li
The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Yunnan Cancer
Radiomics, Pathomics, Colorectal Cancer
Herui Yao
Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Department of Medical Oncology, Breast Tumor Centre, Phase I Clinical Trial Centre, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
Hao Chen
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.