ModeTv2: GPU-accelerated Motion Decomposition Transformer for Pairwise Optimization in Medical Image Registration

📅 2024-03-25
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
📄 PDF
🤖 AI Summary
Medical image registration faces dual challenges: conventional methods suffer from low computational efficiency, while deep learning approaches often lack accuracy and generalizability. To address these issues, we propose Motion Decomposition Transformer v2 (ModeTv2), a pyramid network that—uniquely—integrates the rigorous pairwise optimization (PO) paradigm into a Transformer architecture. We further design a CUDA-accelerated ModeTv2 operator and a lightweight RegHead module to enable differentiable deformable field modeling and post-refinement. Our method combines a CNN-Transformer hybrid backbone with PO-based training. Evaluated on three brain MRI and one abdominal CT datasets, ModeTv2 achieves state-of-the-art performance: +3.2% Dice score, +1.8% Dice Similarity Coefficient (DSC), and 200× faster inference than conventional methods. It simultaneously delivers high accuracy, computational efficiency, strong interpretability, and clinical applicability. The source code is publicly available.

Technology Category

Application Category

📝 Abstract
Deformable image registration plays a crucial role in medical imaging, aiding in disease diagnosis and image-guided interventions. Traditional iterative methods are slow, while deep learning (DL) accelerates solutions but faces usability and precision challenges. This study introduces a pyramid network with the enhanced motion decomposition Transformer (ModeTv2) operator, showcasing superior pairwise optimization (PO) akin to traditional methods. We re-implement ModeT operator with CUDA extensions to enhance its computational efficiency. We further propose RegHead module which refines deformation fields, improves the realism of deformation and reduces parameters. By adopting the PO, the proposed network balances accuracy, efficiency, and generalizability. Extensive experiments on three public brain MRI datasets and one abdominal CT dataset demonstrate the network's suitability for PO, providing a DL model with enhanced usability and interpretability. The code is publicly available at https://github.com/ZAX130/ModeTv2.
Problem

Research questions and friction points this paper is trying to address.

Enhances medical image registration accuracy and efficiency
Addresses usability and precision challenges in deep learning
Improves computational efficiency with GPU-accelerated Transformer
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-accelerated motion decomposition Transformer
CUDA extensions for computational efficiency
RegHead module refines deformation fields
🔎 Similar Papers
No similar papers found.
H
Hai-qun Wang
School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
Z
Zhuoyuan Wang
School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
D
Dong Ni
School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
Y
Yi Wang
School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China