🤖 AI Summary
This work proposes a domain-specific self-supervised pretraining method for 3D medical images (MRI/CT) to address intensity mismatches arising from cross-device variations and complex anatomical deformations in medical image registration. Building upon the DINO framework, the approach learns semantically rich, dense voxel-level features without requiring labeled data, thereby effectively supporting deformable registration. Evaluated on cross-patient abdominal MRI/CT registration tasks, the method significantly outperforms DINOv2 models pretrained on natural images, while incurring lower inference overhead and delivering superior out-of-domain performance. The proposed technique enhances both robustness and efficiency in cross-modal and cross-patient registration scenarios.
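The summary does not include code, but the core idea of feature-based deformable registration is that alignment is scored on dense learned features rather than raw intensities. As a rough, hypothetical illustration (the function name, shapes, and the choice of cosine similarity are assumptions, not the authors' implementation), a voxel-wise feature similarity loss might look like:

```python
import numpy as np

def feature_similarity_loss(fixed_feats, warped_feats, eps=1e-8):
    """Negative mean cosine similarity between dense feature maps.

    fixed_feats, warped_feats: arrays of shape (C, D, H, W), i.e. a
    C-dimensional feature vector at every voxel of a 3D volume.
    Lower values indicate better alignment (-1 at perfect agreement).
    """
    # Flatten spatial dims: each column is one voxel's feature vector.
    f = fixed_feats.reshape(fixed_feats.shape[0], -1)
    w = warped_feats.reshape(warped_feats.shape[0], -1)
    # L2-normalize per voxel so the score ignores feature magnitude.
    f = f / (np.linalg.norm(f, axis=0, keepdims=True) + eps)
    w = w / (np.linalg.norm(w, axis=0, keepdims=True) + eps)
    # Average cosine similarity over all voxels, negated as a loss.
    return -np.mean(np.sum(f * w, axis=0))
```

Because the features are intended to be semantically informed, a loss of this kind can match corresponding anatomy across scanners and modalities where intensity-based metrics fail.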
📝 Abstract
Medical image registration is a critical component of clinical imaging workflows, enabling accurate longitudinal assessment, multi-modal data fusion, and image-guided interventions. Intensity-based approaches often struggle with inter-scanner variability and complex anatomical deformations, whereas feature-based methods offer improved robustness by leveraging semantically informed representations. In this work, we investigate DINO-style self-supervised pretraining directly on 3D medical imaging data, aiming to learn dense volumetric features well suited for deformable registration. We assess the resulting representations on a challenging interpatient abdominal registration task spanning both MRI and CT modalities. Our domain-specialized pretraining outperforms the DINOv2 model trained on a large-scale collection of natural images, while requiring substantially lower computational resources at inference time. Moreover, it surpasses established registration models under out-of-domain evaluation, demonstrating the value of task-agnostic yet medical-imaging-focused pretraining for robust and efficient 3D image registration.
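For readers unfamiliar with DINO-style pretraining, the mechanism referenced in the abstract is self-distillation: a student network is trained to match the output distribution of a momentum (EMA) teacher under different views of the same data, with centering and a low teacher temperature to avoid collapse. The sketch below follows the published DINO recipe in NumPy; the function names, temperatures, and momentum value are generic defaults from that recipe, not details taken from this paper.

```python
import numpy as np

def softmax(x, tau):
    """Temperature-scaled softmax over the last axis."""
    z = x / tau
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dino_loss(student_logits, teacher_logits, center, tau_s=0.1, tau_t=0.04):
    """Cross-entropy between sharpened teacher and student distributions.

    center is a running mean of teacher logits; subtracting it (centering)
    plus the low teacher temperature (sharpening) prevents collapse.
    In training, no gradient flows through the teacher branch.
    """
    t = softmax(teacher_logits - center, tau_t)          # teacher targets
    log_s = np.log(softmax(student_logits, tau_s) + 1e-12)
    return -(t * log_s).sum(axis=-1).mean()

def ema_update(teacher_w, student_w, m=0.996):
    """Teacher weights track the student as an exponential moving average."""
    return m * teacher_w + (1 - m) * student_w
```

Applied to dense 3D patch or voxel tokens instead of global image crops, this kind of objective yields the labeled-data-free, voxel-level features the abstract describes.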