🤖 AI Summary
This work proposes SSPFormer, a self-supervised pretrained Transformer for MRI. It targets two obstacles to reusing Transformers pretrained on natural images: they do not adapt to the anatomical specificity of MRI, and medical data is scarce and privacy-constrained. SSPFormer combines inverse frequency projection masking, which prioritizes reconstruction of high-frequency anatomical regions, with frequency-weighted FFT noise augmentation, which injects physiologically realistic noise in the Fourier domain. Together these enable structure-aware, artifact-robust feature learning directly from unlabeled raw MRI data. The authors report state-of-the-art performance on segmentation, super-resolution, and denoising, with better preservation of fine MRI detail, and argue that the method meets clinical application requirements.
📝 Abstract
Pre-trained Transformers demonstrate remarkable generalization ability in natural image processing. However, transferring them directly to magnetic resonance imaging (MRI) faces two key challenges: they cannot adapt to the specificity of medical anatomical structures, and the privacy and scarcity of medical data limit how much training data is available. To address these issues, this paper proposes a Self-Supervised Pretrained Transformer (SSPFormer) for MRI, which learns domain-specific feature representations of medical images from unlabeled raw imaging data. To tackle the domain gap and data scarcity, we introduce inverse frequency projection masking, which prioritizes the reconstruction of high-frequency anatomical regions to enforce structure-aware representation learning. To improve robustness against real-world MRI artifacts, we also employ frequency-weighted FFT noise augmentation, which injects physiologically realistic noise in the Fourier domain. Together, these strategies enable the model to learn domain-invariant and artifact-robust features directly from raw scans. In extensive experiments on segmentation, super-resolution, and denoising tasks, SSPFormer achieves state-of-the-art performance, supporting its ability to preserve fine-grained MRI detail and to meet clinical application requirements.
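The abstract names the two pretraining strategies but does not give their formulas, so the NumPy sketch below is only one plausible reading of them, not the authors' implementation. The function names, the parameters (`patch`, `mask_ratio`, `cutoff`, `sigma`, `power`), the radial weighting, and the use of Gaussian noise as a stand-in for "physiologically realistic" noise are all assumptions made for illustration.

```python
import numpy as np


def high_frequency_patch_mask(image, patch=16, mask_ratio=0.75, cutoff=0.1, rng=None):
    """Pick patches to mask, favoring patches rich in high-frequency detail.

    Hypothetical reading of "inverse frequency projection masking":
    high-pass the image in the Fourier domain, project it back to image
    space, and sample masked patches in proportion to their energy.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape

    # Radial frequency (cycles/pixel) of every FFT bin; keep bins above cutoff.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    highpass = np.sqrt(fx**2 + fy**2) > cutoff

    # Inverse FFT of the high-pass spectrum gives a detail map in image space.
    detail = np.real(np.fft.ifft2(np.fft.fft2(image) * highpass))

    # Sum squared detail per non-overlapping patch (edges that do not fit are cropped).
    gh, gw = h // patch, w // patch
    blocks = detail[: gh * patch, : gw * patch].reshape(gh, patch, gw, patch)
    energy = (blocks**2).sum(axis=(1, 3)).ravel()

    # Sampling probability grows with high-frequency energy; epsilon avoids zeros.
    probs = (energy + 1e-12) / (energy + 1e-12).sum()
    n_mask = int(round(mask_ratio * energy.size))
    chosen = rng.choice(energy.size, size=n_mask, replace=False, p=probs)

    mask = np.zeros(energy.size, dtype=bool)
    mask[chosen] = True
    return mask.reshape(gh, gw)  # True marks a patch to be masked


def frequency_weighted_fft_noise(image, sigma=0.05, power=1.0, rng=None):
    """Add complex Gaussian noise in the Fourier domain, weighted by radial frequency.

    Assumed weighting: zero at the DC bin, rising to one at the highest
    frequency, so the image mean is left untouched.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape

    # Centered spectrum: the DC bin sits at index (h // 2, w // 2).
    kspace = np.fft.fftshift(np.fft.fft2(image))

    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx**2 + fy**2)
    weight = (radius / radius.max()) ** power

    # Scale the noise relative to the average spectral magnitude.
    noise = rng.normal(size=(h, w)) + 1j * rng.normal(size=(h, w))
    scale = sigma * np.abs(kspace).mean()
    noisy_kspace = kspace + scale * weight * noise

    # Back to image space; the small imaginary residue is discarded.
    return np.real(np.fft.ifft2(np.fft.ifftshift(noisy_kspace)))
```

In a masked-reconstruction setup, the boolean grid returned by `high_frequency_patch_mask` would select which patch tokens are hidden from the encoder, while `frequency_weighted_fft_noise` would be applied to the input slice as an augmentation. Both functions expect a single 2D slice; a real pipeline would also need to handle batches, complex-valued k-space data, and artifact models beyond Gaussian noise.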