H-Net: A Multitask Architecture for Simultaneous 3D Force Estimation and Stereo Semantic Segmentation in Intracardiac Catheters

📅 2024-12-31
🏛️ IEEE Robotics and Automation Letters
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current cardiac catheterization procedures lack a sensing system that can capture the catheter's 3D deformation and multi-directional contact forces at the same time, which limits intraoperative visuo-haptic coordination. This paper proposes what the authors believe is the first end-to-end framework that jointly performs catheter semantic segmentation and 3D force estimation from dual-view X-ray images. The lightweight multi-input, multi-output encoder-decoder architecture uses two segmentation branches with shared weights and a single force regression head, and it applies stereo vision across the biplanar fluoroscopic projections to infer 3D contact forces at the catheter tip. The authors report state-of-the-art performance on both catheter segmentation and 3D force prediction, with a design aimed at interventional devices with limited computational resources.

📝 Abstract
The success rate of catheterization procedures is closely linked to the sensory data provided to the surgeon. Vision-based deep learning models can deliver both tactile and visual information in a sensor-free manner, while also being cost-effective to produce. Given the complexity of these models for devices with limited computational resources, research has focused on force estimation and catheter segmentation separately. However, there is a lack of a comprehensive architecture capable of simultaneously segmenting the catheter from two different angles and estimating the applied forces in 3D. To bridge this gap, this work proposes a novel, lightweight, multi-input, multi-output encoder-decoder-based architecture. It is designed to segment the catheter from two points of view and concurrently measure the applied forces in the $x$, $y$, and $z$ directions. This network processes two simultaneous X-Ray images, intended to be fed by a biplane fluoroscopy system, showing a catheter's deflection from different angles. It uses two parallel sub-networks with shared parameters to output two segmentation maps corresponding to the inputs. Additionally, it leverages stereo vision to estimate the applied forces at the catheter's tip in 3D. The architecture features two input channels, two classification heads for segmentation, and a regression head for force estimation through a single end-to-end architecture. The output of all heads was assessed and compared with the literature, demonstrating state-of-the-art performance in both segmentation and force estimation. To the best of the authors' knowledge, this is the first time such a model has been proposed.
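The two-input, three-output layout described in the abstract can be illustrated with a toy sketch in plain Python. This is not the authors' H-Net: the class name `ToyHNet`, the per-pixel "encoder", the pooling step, the layer sizes, and the random weights are all assumptions chosen only to show how one set of parameters can serve both views while a single regression head produces the three force components.

```python
import math
import random
from typing import List, Tuple

Image = List[List[float]]  # H x W grayscale image, values in [0, 1]


class ToyHNet:
    """Hypothetical toy model: two masks plus one 3D force vector."""

    def __init__(self, n_features: int = 4, seed: int = 0) -> None:
        rng = random.Random(seed)
        k = n_features
        # One encoder and one segmentation head, reused for both views
        # (this is the "shared parameters" idea; sizes are assumptions).
        self.enc_w = [rng.uniform(-1, 1) for _ in range(k)]
        self.enc_b = [rng.uniform(-1, 1) for _ in range(k)]
        self.seg_w = [rng.uniform(-1, 1) for _ in range(k)]
        # Regression head: maps the 2k pooled features to (Fx, Fy, Fz).
        self.force_w = [[rng.uniform(-1, 1) for _ in range(2 * k)]
                        for _ in range(3)]

    def _encode(self, image: Image) -> List[List[List[float]]]:
        # Per-pixel features: a 1x1 "convolution" followed by tanh.
        return [[[math.tanh(w * px + b)
                  for w, b in zip(self.enc_w, self.enc_b)]
                 for px in row] for row in image]

    def _segment(self, feats: List[List[List[float]]]) -> Image:
        # Per-pixel catheter probability: sigmoid of a linear map.
        return [[1.0 / (1.0 + math.exp(-sum(w * f for w, f in zip(self.seg_w, px))))
                 for px in row] for row in feats]

    @staticmethod
    def _pool(feats: List[List[List[float]]]) -> List[float]:
        # Global average pooling: one value per feature channel.
        pixels = [px for row in feats for px in row]
        return [sum(channel) / len(pixels) for channel in zip(*pixels)]

    def forward(self, view_a: Image, view_b: Image) -> Tuple[Image, Image, List[float]]:
        feats_a, feats_b = self._encode(view_a), self._encode(view_b)
        mask_a, mask_b = self._segment(feats_a), self._segment(feats_b)
        # Fuse both views before regression (list concatenation, length 2k).
        fused = self._pool(feats_a) + self._pool(feats_b)
        force = [sum(w * f for w, f in zip(row, fused)) for row in self.force_w]
        return mask_a, mask_b, force


# Usage with two made-up 2x2 "views" of the same scene.
net = ToyHNet()
view_a = [[0.0, 0.5], [1.0, 0.25]]
view_b = [[0.2, 0.2], [0.9, 0.1]]
mask_a, mask_b, force = net.forward(view_a, view_b)
```

Because both views pass through the same encoder and segmentation weights, feeding the same image twice yields identical masks, while the force head always sees features from both views at once. A real model would replace the per-pixel map with convolutional encoder-decoder branches and train all three heads jointly.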
Problem

Research questions and friction points this paper is trying to address.

Catheterization
Shape Recognition
Force Measurement
Innovation

Methods, ideas, or system contributions that make the work stand out.

H-Net
Stereovision Technology
Deep Learning in Catheterization