🤖 AI Summary
This paper addresses the challenging problem of unsupervised intraoperative multimodal (MRI/CT/US) 3D medical image registration, where large inter-modal intensity discrepancies and complex soft-tissue deformations preclude reliable manual annotations. To this end, we propose an unsupervised multi-level correlation balancing optimization framework. Our key contributions are: (1) a modality-agnostic neighborhood descriptor enabling robust cross-modal feature extraction; (2) a multi-level pyramid fusion scheme coupled with dense correlation modeling to jointly optimize global and local deformation fields; and (3) a weight-balanced coupled convex optimization strategy that enhances both consistency and accuracy of the estimated deformation field. Evaluated on two benchmark challenges—Learn2Reg 2024 ReMIND2Reg and COMULIS3DCLEM—the framework achieves second place on both leaderboards. The implementation is publicly available.
📝 Abstract
Surgical navigation based on multimodal image registration plays a significant role in intraoperative guidance by showing surgeons the position of the target area relative to critical anatomical structures during surgery. However, owing to intensity differences between modalities and to intraoperative image deformation caused by tissue displacement and removal, effective registration of preoperative and intraoperative multimodal images remains highly challenging. To address the multimodal registration challenges in Learn2Reg 2024, we design an unsupervised multimodal medical image registration method based on multilevel correlation balanced optimization (MCBO). First, the features of each modality are extracted with a modality-independent neighborhood descriptor, mapping the multimodal images into a common feature space. Second, a multilevel pyramidal fusion optimization mechanism achieves global optimization and local detail complementation of the deformation field by applying dense correlation analysis and weight-balanced coupled convex optimization to input features at different scales. For preoperative images in different modalities, valid information across modalities is aligned and stacked through maximum fusion of the deformation fields. Our method targets the ReMIND2Reg task in Learn2Reg 2024; to verify its generality, we also evaluate it on the COMULIS3DCLEM task. Our method achieved second place in the validation phase of both tasks. The code is available at https://github.com/wjiazheng/MCBO.
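To make the first step concrete, the sketch below shows how a MIND-style modality-independent neighborhood descriptor can map an intensity volume into a feature space where cross-modal comparison is meaningful. This is an illustrative simplification (6-connected offsets, box-filter patch distances, a mean-based variance estimate), not the authors' implementation; the function name and parameters are our own.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def mind_descriptor(img, radius=1, offsets=None):
    """Illustrative MIND-style descriptor.

    For each voxel, the local patch is compared with patches shifted by a
    small set of neighborhood offsets; exponentiated, variance-normalized
    patch distances form a feature vector that is robust to the intensity
    mapping of the modality.
    """
    if offsets is None:
        # 6-connected neighborhood in 3D (simplified search region)
        offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                   (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    size = 2 * radius + 1
    dists = []
    for off in offsets:
        shifted = np.roll(img, shift=off, axis=(0, 1, 2))
        # patch SSD via a box filter over the squared difference
        dists.append(uniform_filter((img - shifted) ** 2, size=size))
    dists = np.stack(dists, axis=0)
    # local variance estimate: mean patch distance, clamped for stability
    var = np.clip(dists.mean(axis=0), 1e-6, None)
    mind = np.exp(-dists / var)
    # per-voxel normalization so the strongest response is 1
    return mind / mind.max(axis=0, keepdims=True)

fixed = np.random.rand(16, 16, 16).astype(np.float32)
feat = mind_descriptor(fixed)
print(feat.shape)  # one feature channel per offset: (6, 16, 16, 16)
```

Registration then proceeds on such descriptor volumes (e.g., via SSD between features of the fixed and moving images), which sidesteps the raw MRI/CT/US intensity discrepancy.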