AI Summary
This work addresses the challenge of establishing stable and interpretable 2D-3D geometric correspondences between preoperative 3D imaging and intraoperative 2D views in laparoscopic liver surgery, a key obstacle to accurate image registration. To this end, the authors propose Land-Reg, a framework that enables explainable cross-modal rigid and non-rigid registration by explicitly modeling 2D-3D landmark correspondences grounded in latent features. Key innovations include a latent alignment module and an uncertainty-aware overlapping landmark detector, complemented by reprojection consistency constraints and local isometric regularization to constrain deformation. Evaluated on the P2ILF dataset, the method outperforms existing approaches in both rigid pose estimation and non-rigid deformation registration.
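To make the idea of explicit 2D-3D landmark correspondences concrete, the following is a minimal sketch of similarity-based matching between landmark descriptors that already live in a shared latent space. It is an illustration only and not the authors' implementation: the function names `cosine` and `match_landmarks`, the mutual-nearest-neighbour rule, and the `min_similarity` threshold are all assumptions introduced here, and the summary does not say how Land-Reg performs its matching. Plain Python is used so the sketch runs without dependencies.

```python
import math


def cosine(a, b):
    """Cosine similarity between two descriptors; 0.0 if either has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def match_landmarks(feats_2d, feats_3d, min_similarity=0.5):
    """Hypothetical matcher: keep (i, j, score) pairs that are mutual nearest
    neighbours in cosine similarity and score at least `min_similarity`."""
    sim = [[cosine(f2, f3) for f3 in feats_3d] for f2 in feats_2d]
    matches = []
    for i, row in enumerate(sim):
        # Best 3D candidate for 2D landmark i.
        j = max(range(len(row)), key=row.__getitem__)
        # Best 2D candidate for that 3D landmark (mutual check).
        best_i = max(range(len(sim)), key=lambda k: sim[k][j])
        if best_i == i and row[j] >= min_similarity:
            matches.append((i, j, row[j]))
    return matches


# Toy descriptors: two 2D landmarks, three 3D landmarks.
toy_2d = [[1.0, 0.0], [0.0, 1.0]]
toy_3d = [[0.0, 2.0], [3.0, 0.1], [-1.0, -1.0]]
toy_matches = match_landmarks(toy_2d, toy_3d)
```

The mutual check discards one-sided matches, which is one simple way to keep only correspondences that both modalities agree on; an uncertainty-aware detector such as the one the summary describes could instead weight or filter candidates by predicted confidence.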
Abstract
In laparoscopic liver surgery, augmented reality technology enhances intraoperative anatomical guidance by overlaying 3D liver models from preoperative CT/MRI onto laparoscopic 2D views. However, existing registration methods lack explicit modeling of reliable 2D-3D geometric correspondences supported by latent evidence, leading to limited interpretability and potentially unstable alignment in clinical scenarios. In this work, we introduce Land-Reg, a correspondence-driven deformable registration framework that explicitly learns latent-grounded 2D-3D landmark correspondences as an interpretable intermediate representation to bridge cross-modal alignment. For rigid registration, Land-Reg employs a Cross-modal Latent Alignment module to map multi-modal features into a unified latent space. In addition, an Uncertainty-enhanced Overlap Landmark Detector with similarity matching is proposed to robustly estimate explicit 2D-3D landmark correspondences. For non-rigid registration, we design a novel shape-constrained supervision strategy that anchors shape deformation to matched landmarks through reprojection consistency and incorporates local-isometric regularization to alleviate inherent 2D-3D depth ambiguity, while a rendered-mask alignment enforces global shape consistency. Experimental results on the P2ILF dataset demonstrate the superiority of our method on both rigid pose estimation and non-rigid deformation. Our code will be available at https://github.com/cuiruize/Land-Reg.
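The two geometric constraints named in the abstract, reprojection consistency and local-isometric regularization, can be sketched as simple loss terms. This is a hedged illustration under stated assumptions, not the Land-Reg implementation: it assumes a pinhole camera with intrinsics `(fx, fy, cx, cy)`, 3D points already expressed in the camera frame, and an edge-length-preservation form of the isometry penalty. The function names `project`, `reprojection_loss`, and `local_isometry_loss` are hypothetical.

```python
import math


def project(point, fx, fy, cx, cy):
    """Pinhole projection of one camera-frame 3D point (x, y, z) to a pixel (u, v)."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_loss(landmarks_3d, landmarks_2d, intrinsics):
    """Mean pixel distance between projected 3D landmarks and their matched
    2D landmarks (assumed form of a reprojection consistency term)."""
    fx, fy, cx, cy = intrinsics
    total = 0.0
    for p3, p2 in zip(landmarks_3d, landmarks_2d):
        u, v = project(p3, fx, fy, cx, cy)
        total += math.hypot(u - p2[0], v - p2[1])
    return total / len(landmarks_3d)


def local_isometry_loss(rest_vertices, deformed_vertices, edges):
    """Mean absolute change in mesh edge length caused by the deformation
    (assumed form of a local-isometric regularizer)."""
    total = 0.0
    for i, j in edges:
        rest_len = math.dist(rest_vertices[i], rest_vertices[j])
        new_len = math.dist(deformed_vertices[i], deformed_vertices[j])
        total += abs(new_len - rest_len)
    return total / len(edges)


# Toy example: two landmarks at depth 2 and a single mesh edge.
K = (500.0, 500.0, 320.0, 240.0)
pts_3d = [(0.0, 0.0, 2.0), (0.2, 0.0, 2.0)]
pts_2d = [(320.0, 240.0), (370.0, 240.0)]
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
```

The reprojection term constrains only image-plane position, so a landmark can slide along its viewing ray without penalty. A term that penalizes changes in local edge length is one way to limit such depth-direction drift, which is consistent with the abstract's stated motivation for the regularizer.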