🤖 AI Summary
This work addresses the limited ability of existing self-supervised methods to represent fine-grained lesion detail in mammogram classification. We propose a multi-scale masked image modeling (MIM) self-supervised learning framework designed for breast imaging. The method randomly masks the input mammogram and reconstructs it at multiple resolutions, jointly optimizing spatial details (e.g., edge textures) and semantic features, which addresses the representational limits of single-scale MIM. To the best of our knowledge, this is the first multi-scale MIM pretraining paradigm tailored for breast imaging. Evaluated on the CBIS-DDSM dataset, it improves on state-of-the-art methods by 3% AP and 1% AUC for pathology classification, and by 4% AP and 2% AUC for mass margin classification.
📝 Abstract
Self-supervised learning (SSL) has garnered substantial interest within the machine learning and computer vision communities. Two prominent SSL approaches are contrastive learning and self-distillation with cropping augmentation. More recently, masked image modeling (MIM) has emerged as a more powerful SSL technique that uses image inpainting as a pretext task. MIM creates a strong inductive bias toward meaningful spatial and semantic understanding. This has opened up new opportunities for SSL to contribute not only to classification tasks but also to more complex applications such as object detection and image segmentation. Building on this progress, this paper introduces a scalable and practical SSL approach centered on more challenging pretext tasks that encourage the model to learn robust features. Specifically, we use multi-scale image reconstruction from randomly masked input images as the foundation for feature learning. We hypothesize that reconstructing high-resolution images enables the model to attend to finer spatial details, which is particularly beneficial for discerning subtle structures in medical images. The proposed SSL features improve classification performance on the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) dataset. In pathology classification, our method achieves a 3% increase in average precision (AP) and a 1% increase in the area under the receiver operating characteristic curve (AUC) compared to state-of-the-art (SOTA) algorithms. In mass margin classification, it achieves a 4% increase in AP and a 2% increase in AUC.
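To make the pretext task concrete, the following is a minimal sketch of a multi-scale masked reconstruction loss in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the mask ratio, patch size, set of scales, loss weights, or network, so the values used here (75% masking, 4x4 patches, scale factors 1, 2, and 4, equal weights) and all function names are hypothetical. Scoring only the masked patches follows common MIM practice and is likewise an assumption.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, rng):
    """Return a grid_h x grid_w grid of booleans; True marks a masked patch."""
    n = grid_h * grid_w
    n_masked = int(round(mask_ratio * n))
    masked = set(rng.sample(range(n), n_masked))
    return [[(r * grid_w + c) in masked for c in range(grid_w)]
            for r in range(grid_h)]


def downsample(img, factor):
    """Average-pool a 2D image (list of rows) by an integer factor."""
    out = []
    for r in range(0, len(img), factor):
        row = []
        for c in range(0, len(img[0]), factor):
            block = [img[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def masked_mse(pred, target, mask, patch):
    """Mean squared error over pixels that fall inside masked patches."""
    total, count = 0.0, 0
    for r in range(len(target)):
        for c in range(len(target[0])):
            if mask[r // patch][c // patch]:
                total += (pred[r][c] - target[r][c]) ** 2
                count += 1
    return total / count if count else 0.0


def multi_scale_loss(preds, image, mask, patch, factors, weights):
    """Weighted sum of masked reconstruction errors, one term per scale.

    preds[i] is the reconstruction at scale factors[i]. The patch size must
    be divisible by every factor so the patch grid is the same at all scales.
    """
    loss = 0.0
    for pred, factor, weight in zip(preds, factors, weights):
        target = downsample(image, factor)
        loss += weight * masked_mse(pred, target, mask, patch // factor)
    return loss


# Usage: a 16x16 toy "image", 4x4 patches, three hypothetical scales.
rng = random.Random(0)
image = [[float((r * 16 + c) % 7) for c in range(16)] for r in range(16)]
mask = random_patch_mask(4, 4, mask_ratio=0.75, rng=rng)
factors, weights = (1, 2, 4), (1.0, 1.0, 1.0)

# Stand-in for a decoder's output: an all-zero reconstruction at each scale.
blank_preds = [[[0.0] * (16 // f) for _ in range(16 // f)] for f in factors]
print(multi_scale_loss(blank_preds, image, mask, 4, factors, weights))
```

In a real pipeline the reconstructions would come from a decoder that sees only the visible patches, and the full-resolution term is the one the abstract's hypothesis concerns: it penalizes errors in fine spatial detail that the coarser targets average away.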