AI Summary
In laparoscopic liver resection, existing preoperative-to-intraoperative registration methods rely on ambiguously defined anatomical landmarks and make little use of intraoperative visual information when modeling liver deformation, which leads to inaccurate spatial localization. This paper proposes a landmark-free registration framework that recasts the conventional 3D-2D pipeline as a 3D-3D problem and splits it into a rigid stage followed by a non-rigid stage. In the rigid stage, a feature-disentangled Transformer learns robust correspondences from which the rigid transformation is recovered. In the non-rigid stage, a structure-regularized deformation network, built on a low-rank transformer that models geometric similarity, adjusts the preoperative model to fit the intraoperative liver surface. The framework is trained with self-supervised learning and evaluated on synthetic data and on P2I-LReg, a newly constructed in-vivo dataset of liver resection videos from 21 patients with 346 keyframes. According to the authors, it outperforms existing approaches on both, and a user study supports its potential clinical applicability.
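The rigid part of the registration can be illustrated with a small sketch. Assuming a set of matched 3D point pairs is already available (in the paper these would come from the learned correspondences; here they are simply given), a closed-form least-squares rigid transform can be recovered with the classical Kabsch/SVD method. This is a generic textbook technique chosen for illustration, not necessarily the solver the authors use, and the function name `estimate_rigid_transform` is hypothetical.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src[i] + t ~= dst[i].

    Classical Kabsch/SVD solution; assumes src and dst are (N, 3) arrays of
    already-matched points (an assumption made for this illustration).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)

    # Center both point sets on their centroids.
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)

    # 3x3 cross-covariance between the centered sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)

    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    t = dst_centroid - R @ src_centroid
    return R, t
```

With noisy or partially wrong correspondences, a robust wrapper (for example RANSAC or per-pair weights) would normally be placed around this step; that is omitted here for brevity.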
Abstract
Liver registration, which overlays preoperative 3D models onto intraoperative 2D frames, helps surgeons perceive the spatial anatomy of the liver clearly and can thereby improve the surgical success rate. Existing registration methods rely heavily on anatomical landmark-based workflows, which have two major limitations: 1) ambiguous landmark definitions fail to provide effective markers for registration; 2) intraoperative liver visual information is insufficiently integrated into shape deformation modeling. To address these challenges, we propose a landmark-free preoperative-to-intraoperative registration framework based on self-supervised learning. The framework transforms the conventional 3D-2D workflow into a 3D-3D registration pipeline, which is then decoupled into rigid and non-rigid registration subtasks. It first introduces a feature-disentangled transformer to learn robust correspondences for recovering rigid transformations. A structure-regularized deformation network is then designed to adjust the preoperative model so that it aligns with the intraoperative liver surface. This network captures structural correlations through geometry similarity modeling in a low-rank transformer network. To facilitate validation of the registration performance, we also construct an in-vivo registration dataset, P2I-LReg, from liver resection videos of 21 patients. It contains 346 keyframes that provide a global view of the liver, together with liver mask annotations and calibrated camera intrinsic parameters. Extensive experiments and user studies on both synthetic and in-vivo datasets demonstrate the superiority and potential clinical applicability of our method.
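To make the 3D-2D to 3D-3D reformulation concrete, the sketch below shows one common way to lift a 2D frame into a 3D point set using a liver mask and the camera intrinsics. The abstract does not state how intraoperative 3D geometry is obtained, so the per-pixel depth map used here is an assumption made purely for illustration, as are the function name `backproject_masked_depth` and the standard pinhole layout of the intrinsic matrix `K`.

```python
import numpy as np


def backproject_masked_depth(depth, mask, K):
    """Lift masked pixels of a depth map to 3D points in the camera frame.

    depth: (H, W) array of depths along the optical axis (assumed input).
    mask:  (H, W) boolean array marking liver pixels.
    K:     3x3 pinhole intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    """
    depth = np.asarray(depth, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    K = np.asarray(K, dtype=float)

    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Keep only liver pixels that have a valid (positive) depth.
    valid = mask & (depth > 0)
    v, u = np.nonzero(valid)  # v: row (y) index, u: column (x) index

    # Invert the pinhole projection u = fx * x / z + cx, v = fy * y / z + cy.
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    return np.stack([x, y, z], axis=1)  # (N, 3) point set
```

The resulting point set lives in the same kind of space as the preoperative model's surface points, which is what allows the problem to be posed as 3D-3D registration.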