ModeTv2: GPU-accelerated Motion Decomposition Transformer for Pairwise Optimization in Medical Image Registration

📅 2024-03-25

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Medical image registration faces two challenges: conventional iterative methods are computationally slow, while deep learning approaches face usability and precision limitations. To address these issues, we propose Motion Decomposition Transformer v2 (ModeTv2), a pyramid network whose Transformer-based operator supports the pairwise optimization (PO) paradigm used by traditional methods. We re-implement the ModeT operator with CUDA extensions to improve computational efficiency, and we add a lightweight RegHead module that refines the deformation field, improves its realism, and reduces the parameter count. Evaluated on three public brain MRI datasets and one abdominal CT dataset, the network is reported to balance registration accuracy (measured by Dice score), efficiency, and generalizability, while offering better usability and interpretability than earlier deep learning models. The source code is publicly available.
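For readers unfamiliar with the Dice score mentioned above, the following is a minimal sketch of how the overlap between a fixed segmentation and a warped moving segmentation is commonly computed for one anatomical label. The function name `dice_score`, the toy label lists, and the convention of returning 1.0 when the label is absent from both inputs are assumptions made for illustration; this is not the paper's evaluation code.

```python
def dice_score(seg_a, seg_b, label):
    """Dice overlap for one label between two flat label sequences.

    Dice = 2 * |A intersect B| / (|A| + |B|), where A and B are the sets of
    positions carrying `label` in each segmentation.
    """
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must have the same number of elements")
    mask_a = [v == label for v in seg_a]
    mask_b = [v == label for v in seg_b]
    intersection = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    total = sum(mask_a) + sum(mask_b)
    if total == 0:
        # Assumed convention: a label missing from both inputs counts as full overlap.
        return 1.0
    return 2.0 * intersection / total


# Toy example (hypothetical data): labels of a fixed image and of a warped moving image.
fixed_labels = [0, 1, 1, 1, 0, 2, 2, 0]
warped_labels = [0, 1, 1, 0, 0, 2, 2, 2]

for label in (1, 2):
    print(label, dice_score(fixed_labels, warped_labels, label))
```

In registration benchmarks the score is typically averaged over several anatomical labels and over all test pairs; a value of 1.0 means perfect overlap and 0.0 means none.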

📝 Abstract

Deformable image registration plays a crucial role in medical imaging, aiding in disease diagnosis and image-guided interventions. Traditional iterative methods are slow, while deep learning (DL) accelerates solutions but faces usability and precision challenges. This study introduces a pyramid network with the enhanced motion decomposition Transformer (ModeTv2) operator, showcasing superior pairwise optimization (PO) akin to traditional methods. We re-implement the ModeT operator with CUDA extensions to enhance its computational efficiency. We further propose a RegHead module that refines deformation fields, improves the realism of the deformation, and reduces the number of parameters. By adopting PO, the proposed network balances accuracy, efficiency, and generalizability. Extensive experiments on three public brain MRI datasets and one abdominal CT dataset demonstrate the network's suitability for PO, providing a DL model with enhanced usability and interpretability. The code is publicly available at https://github.com/ZAX130/ModeTv2.
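To make the idea of pairwise optimization concrete, here is a toy sketch in one dimension: a displacement field is fitted to a single fixed/moving pair by gradient descent on a squared-difference similarity term plus a smoothness penalty. This is only an illustration of per-pair optimization as the term is generally used in registration. It is not the ModeTv2 network, its CUDA operator, or the authors' training procedure, and every name and hyperparameter (`pairwise_optimize`, `lam`, `lr`, the Gaussian test signals) is hypothetical.

```python
import math


def sample(signal, x):
    """Linearly interpolate `signal` at real position x (clamped to the ends)."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    t = x - i
    return (1.0 - t) * signal[i] + t * signal[i + 1]


def slope(signal, x):
    """Derivative of the linear interpolant at x (zero at or beyond the ends)."""
    if x <= 0.0 or x >= len(signal) - 1.0:
        return 0.0
    i = min(int(x), len(signal) - 2)
    return signal[i + 1] - signal[i]


def loss(fixed, moving, disp, lam):
    """Squared intensity difference plus a first-order smoothness penalty."""
    sim = sum((sample(moving, i + d) - f) ** 2
              for i, (f, d) in enumerate(zip(fixed, disp)))
    smooth = sum((disp[i + 1] - disp[i]) ** 2 for i in range(len(disp) - 1))
    return sim + lam * smooth


def pairwise_optimize(fixed, moving, steps=500, lr=0.1, lam=0.1):
    """Fit a displacement field to this one pair by plain gradient descent."""
    disp = [0.0] * len(fixed)
    for _ in range(steps):
        grad = []
        for i, (f, d) in enumerate(zip(fixed, disp)):
            # Gradient of the similarity term via the chain rule.
            g = 2.0 * (sample(moving, i + d) - f) * slope(moving, i + d)
            # Gradient of the smoothness term with respect to disp[i].
            if i > 0:
                g += 2.0 * lam * (d - disp[i - 1])
            if i < len(disp) - 1:
                g += 2.0 * lam * (d - disp[i + 1])
            grad.append(g)
        disp = [d - lr * g for d, g in zip(disp, grad)]
    return disp


# Hypothetical test pair: the same Gaussian bump, shifted by two samples.
n = 24
fixed = [math.exp(-((i - 10) ** 2) / 8.0) for i in range(n)]
moving = [math.exp(-((i - 12) ** 2) / 8.0) for i in range(n)]

initial_loss = loss(fixed, moving, [0.0] * n, lam=0.1)
disp = pairwise_optimize(fixed, moving)
final_loss = loss(fixed, moving, disp, lam=0.1)
print(initial_loss, final_loss)
```

The loop shows why traditional per-pair optimization is slow: every new image pair needs its own iterations. A learning-based model can instead predict an initial deformation in one forward pass and, where it is suited to PO, be refined on the specific pair afterwards.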

Problem

Research questions and friction points this paper is trying to address.

Enhances medical image registration accuracy and efficiency

Addresses usability and precision challenges in deep learning

Improves computational efficiency with GPU-accelerated Transformer

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-accelerated motion decomposition Transformer

CUDA extensions for computational efficiency

RegHead module refines deformation fields
