🤖 AI Summary
This work addresses high-precision 6D pose estimation for textureless surgical instruments under frequent occlusion, missing depth information, and scarce annotated data. To this end, we propose SurfSurg6D, described as the first geometry-consistent dense correspondence framework tailored for surgical instruments, which estimates pose from RGB images alone. We also introduce SynSurg6D, a synthetic dataset that mitigates the scarcity of real-world annotations and broadens pose distribution coverage. By integrating dense correspondence learning with geometric consistency constraints, our method outperforms existing RGB-only approaches on the SurgRIPE, EndoVis2018, and SurgPose benchmarks in both accuracy and robustness.
📝 Abstract
Surgical instrument pose estimation provides crucial information for promising applications, including autonomous robotic surgery, skill assessment, and standardization of surgical workflow. However, the task remains highly challenging due to high precision requirements, frequent occlusions, textureless instruments, scarce depth information, and very limited annotated data. These constraints often lead to unsatisfactory performance when general object pose estimation approaches are applied to surgical scenarios. To address these issues, we first construct a new dataset, SynSurg6D, to alleviate the data shortage in this task. We further propose SurfSurg6D, a dense-correspondence framework tailored for surgical instrument pose estimation. Experimental results on the SurgRIPE, EndoVis2018, and SurgPose datasets demonstrate that adding our generated dataset SynSurg6D diversifies the pose distribution and thereby improves the performance of existing approaches. Furthermore, SurfSurg6D outperforms existing methods, providing a robust solution for precise and efficient RGB-only pose estimation.
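The abstract does not spell out how dense correspondences and a pose are checked against each other, so the following is only a minimal sketch of one common way to do it, not the SurfSurg6D implementation. It assumes a pinhole camera, a set of predicted 2D-to-3D correspondences (pixel locations paired with points on the instrument model), and a candidate pose given as a rotation matrix and translation vector. All function names, the intrinsics `fx, fy, cx, cy`, and the 2-pixel threshold are hypothetical choices made for illustration.

```python
from math import hypot


def project(point, rotation, translation, fx, fy, cx, cy):
    """Project one 3D model point into the image with a pinhole camera."""
    # Camera-frame coordinates: X_cam = R @ X_model + t
    xc, yc, zc = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def reprojection_errors(pixels, model_points, rotation, translation,
                        fx, fy, cx, cy):
    """Pixel distance between each predicted 2D location and the
    projection of its matched 3D model point under the candidate pose."""
    errors = []
    for (u, v), point in zip(pixels, model_points):
        pu, pv = project(point, rotation, translation, fx, fy, cx, cy)
        errors.append(hypot(u - pu, v - pv))
    return errors


def inlier_ratio(errors, threshold_px=2.0):
    """Fraction of correspondences consistent with the pose
    (threshold is an assumed value, not taken from the paper)."""
    return sum(e <= threshold_px for e in errors) / len(errors)
```

A pose that agrees with most correspondences yields a high inlier ratio, while occluded or mispredicted pixels show up as large reprojection errors. In a full pipeline, a check of this kind is typically wrapped inside a robust solver such as PnP with RANSAC; whether SurfSurg6D does so is not stated in the abstract.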