Mono-Modalizing Extremely Heterogeneous Multi-Modal Medical Image Registration

📅 2025-06-18
🤖 AI Summary
Deformable registration between highly heterogeneous modalities (e.g., PET/FA aligned to MRI/CT) is challenging for conventional unsupervised methods, as cross-modality similarity metrics (e.g., NCC, MI) fail to capture alignment, leading to severe deformation artifacts. Method: M2M-Reg (Multi-to-Mono Registration) trains a multi-modal registration model using only mono-modal similarity (e.g., NCC on MRI–MRI pairs), eliminating the need for handcrafted cross-modality metrics, ground-truth deformation fields, or segmentation labels, while preserving the established architectural paradigm so it integrates into existing models. To promote diffeomorphism, the GradCyCon regularizer leverages M2M-Reg's cyclic training scheme; the framework also extends naturally to a semi-supervised setting that uses only pre-aligned and unaligned pairs. Results: On the ADNI dataset, M2M-Reg achieves up to 2x higher Dice (DSC) than prior methods for PET–MRI and FA–MRI registration while markedly suppressing image distortion and preserving anatomical plausibility.

📝 Abstract
In clinical practice, imaging modalities with functional characteristics, such as positron emission tomography (PET) and fractional anisotropy (FA), are often aligned with a structural reference (e.g., MRI, CT) for accurate interpretation or group analysis, necessitating multi-modal deformable image registration (DIR). However, due to the extreme heterogeneity of these modalities compared to standard structural scans, conventional unsupervised DIR methods struggle to learn reliable spatial mappings and often distort images. We find that the similarity metrics guiding these models fail to capture alignment between highly disparate modalities. To address this, we propose M2M-Reg (Multi-to-Mono Registration), a novel framework that trains multi-modal DIR models using only mono-modal similarity while preserving the established architectural paradigm for seamless integration into existing models. We also introduce GradCyCon, a regularizer that leverages M2M-Reg's cyclic training scheme to promote diffeomorphism. Furthermore, our framework naturally extends to a semi-supervised setting, integrating pre-aligned and unaligned pairs only, without requiring ground-truth transformations or segmentation masks. Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset demonstrate that M2M-Reg achieves up to 2x higher DSC than prior methods for PET-MRI and FA-MRI registration, highlighting its effectiveness in handling highly heterogeneous multi-modal DIR. Our code is available at https://github.com/MICV-yonsei/M2M-Reg.
Problem

Research questions and friction points this paper is trying to address.

Aligning highly heterogeneous medical images (e.g., PET, FA) with structural references (e.g., MRI, CT)
Overcoming unreliable spatial mappings in multi-modal deformable image registration (DIR)
Improving alignment accuracy without ground-truth transformations or segmentation masks
Innovation

Methods, ideas, or system contributions that make the work stand out.

M2M-Reg trains multi-modal DIR using mono-modal similarity
GradCyCon regularizer enforces diffeomorphism via cyclic training
Extends to semi-supervised learning without ground-truth data
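The mono-modal similarity the bullets above refer to is typically a metric such as normalized cross-correlation (NCC), which assumes a linear intensity relationship between the two images; the sketch below (illustrative NumPy code, not the authors' implementation) shows a global NCC and why it is reliable within one modality but breaks down when intensities are unrelated across modalities.

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> float:
    """Global normalized cross-correlation between two same-shape images.

    Returns a value in [-1, 1]; 1 means a perfect linear intensity match.
    """
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a = a - a.mean()  # zero-center so NCC is invariant to intensity offset
    b = b - b.mean()
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# A mono-modal pair (image vs. itself) correlates perfectly, while an
# intensity-inverted copy, a crude stand-in for a heterogeneous modality,
# flips the sign entirely: the metric no longer rewards correct alignment.
img = np.random.default_rng(0).random((32, 32))
print(ncc(img, img))        # close to 1.0
print(ncc(img, 1.0 - img))  # close to -1.0
```

In a training loss, 1 - ncc(fixed, warped) would be minimized; M2M-Reg's key idea is to apply such a metric only to mono-modal (e.g., MRI–MRI) pairs rather than across modalities.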
K. Choo
Department of Computer Science, Yonsei University, Seoul, Republic of Korea
Hyunkyung Han
Department of Artificial Intelligence, Yonsei University, Seoul, Republic of Korea
Jinyeong Kim
Yonsei University, College of Medicine, Undergraduate Student
Interpretability · Multimodal · Clinical AI
Chanyong Yoon
Department of Computer Science, Yonsei University, Seoul, Republic of Korea
Seong Jae Hwang
Yonsei University
Machine Learning · Computer Vision · Medical Imaging