Towards Real-Time Autonomous Navigation: Transformer-Based Catheter Tip Tracking in Fluoroscopy

📅 2026-05-13

🤖 AI Summary
This study addresses real-time catheter tip tracking for mechanical thrombectomy, where low-contrast, noisy X-ray fluoroscopy and device occlusion make accurate localization difficult. The authors propose a multi-threaded real-time tracking pipeline (frame reading, preprocessing, inference, and post-processing) and benchmark U-Net, U-Net+Transformer, and SegFormer segmentation models in two-class and three-class formulations. Post-processing consists of two-step component filtering, one-pixel medial skeletonization, and greedy arc-length path following with a contour fall-back. On manually labeled fluoroscopic video of moderate complexity, the two-class SegFormer achieves the lowest mean absolute error, 4.44 mm, compared with 4.60 mm for U-Net and 6.20 mm for U-Net+Transformer. On segmentation benchmarks, the system improves Dice scores by up to 5% over the state-of-the-art CathAction results. The pipeline is intended to supply the live tip coordinates that reinforcement learning-based autonomous navigation requires.
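For readers unfamiliar with the two metrics quoted in the summary, the following is a minimal sketch of how a Dice score and a mean tip error in millimetres are commonly computed. The function names, the use of Euclidean distance per frame, and the `mm_per_pixel` scale factor are illustrative assumptions, not details taken from the paper.

```python
import math


def dice_score(pred, target):
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks given as nested lists of 0/1."""
    inter = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * inter / total


def mean_tip_error_mm(pred_tips, true_tips, mm_per_pixel):
    """Average per-frame distance between predicted and annotated tips, in mm.

    Assumption: the error is the Euclidean pixel distance scaled by a
    known pixel spacing (mm_per_pixel); the paper may define it differently.
    """
    if not pred_tips or len(pred_tips) != len(true_tips):
        raise ValueError("need equal, non-empty lists of tip coordinates")
    dists = [math.dist(p, t) for p, t in zip(pred_tips, true_tips)]
    return mm_per_pixel * sum(dists) / len(dists)
```

For example, `mean_tip_error_mm([(0, 0), (3, 4)], [(0, 0), (0, 0)], 0.5)` averages pixel distances of 0 and 5 and scales by 0.5 mm per pixel.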
📝 Abstract
Purpose: Mechanical thrombectomy (MT) improves stroke outcomes, but is limited by a lack of local treatment access. Widespread distribution of reinforcement learning (RL)-based robotic systems could alleviate this challenge through autonomous navigation, but current RL methods require live device tip coordinate tracking to function. This paper aims to develop and evaluate a real-time catheter tip tracking pipeline under fluoroscopy, addressing challenges such as low contrast, noise, and device occlusion. Methods: A multi-threaded pipeline was designed, incorporating frame reading, preprocessing, inference, and post-processing. Deep learning segmentation models, including U-Net, U-Net+Transformer, and SegFormer, were trained and benchmarked using two-class and three-class formulations. Post-processing involved two-step component filtering, one-pixel medial skeletonization, and greedy arc-length path following with contour fall-back. Results: On manually labeled, moderate-complexity fluoroscopic video data, the two-class SegFormer achieved a mean absolute error of 4.44 mm, outperforming U-Net (4.60 mm), U-Net+Transformer (6.20 mm), and all three-class models (5.19–7.74 mm). On segmentation benchmarks, the system exceeded state-of-the-art CathAction results, with improvements of up to +5% in Dice scores for three-class segmentation. Conclusion: The results demonstrate that the proposed multi-threaded tracking framework maintains stable performance under challenging imaging conditions and outperforms prior benchmarks, providing a reliable and efficient foundation for RL-based autonomous MT navigation.
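The path-following step named in the Methods can be illustrated with a simplified sketch. It assumes the mask has already been filtered and thinned to a one-pixel-wide skeleton, given here as a set of `(row, col)` pixels, and that a start pixel (for instance, where the device enters the image) is known. The function name `trace_tip`, the start-pixel convention, and the nearest-neighbour rule are assumptions for illustration; the contour fall-back is not shown, and this is not the authors' implementation.

```python
import math


def trace_tip(skeleton, start):
    """Greedily walk a one-pixel-wide skeleton from `start`.

    Returns (tip_pixel, arc_length_in_pixels). At each step the walk moves
    to the closest unvisited 8-connected neighbour and stops when none is left.
    """
    remaining = set(skeleton)
    remaining.discard(start)
    current, length = start, 0.0
    while True:
        r, c = current
        neighbours = [
            (r + dr, c + dc)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and (r + dr, c + dc) in remaining
        ]
        if not neighbours:
            # Dead end: treat the last pixel reached as the tip.
            return current, length
        # Prefer axis-aligned steps (distance 1) over diagonal ones (sqrt 2).
        nxt = min(neighbours, key=lambda p: math.dist(p, current))
        length += math.dist(nxt, current)
        remaining.discard(nxt)
        current = nxt
```

Accumulating arc length along the walk, rather than simply taking the farthest pixel from the start, lets the tip be identified even when the device curves back on itself.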
Problem

Research questions and friction points this paper is trying to address.

catheter tip tracking
fluoroscopy
autonomous navigation
real-time tracking
mechanical thrombectomy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based segmentation
real-time catheter tracking
fluoroscopy
autonomous navigation
multi-threaded pipeline
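
The multi-threaded pipeline listed above chains frame reading, preprocessing, inference, and post-processing. One common way to organize such stages is with worker threads connected by bounded queues, sketched below using only the Python standard library. The names `run_pipeline` and `_stage`, the queue size, and the sentinel-based shutdown are hypothetical choices for illustration; the paper does not specify how its pipeline is implemented.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the frame stream


def _stage(fn, inbox, outbox):
    """Apply fn to each item from inbox and forward the result downstream."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(frames, preprocess, infer, postprocess, maxsize=4):
    """Run three processing stages in separate threads; keep frame order."""
    # Bounded queues between stages apply back-pressure; the final queue is
    # unbounded so the last stage never blocks while the reader is feeding.
    qs = [queue.Queue(maxsize=maxsize) for _ in range(3)] + [queue.Queue()]
    stages = [preprocess, infer, postprocess]
    threads = [
        threading.Thread(target=_stage, args=(fn, qs[i], qs[i + 1]), daemon=True)
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    # The calling thread plays the role of the frame reader.
    for frame in frames:
        qs[0].put(frame)
    qs[0].put(_STOP)
    results = []
    while (out := qs[-1].get()) is not _STOP:
        results.append(out)
    for t in threads:
        t.join()
    return results
```

Because each stage has a single worker and the queues are first-in, first-out, results come back in the same order as the input frames, which matters when tip coordinates feed a controller frame by frame.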
Authors

Harry Robertshaw
Surgical & Interventional Engineering, School of Biomedical Engineering & Imaging Sciences, King's College London, London
Yanghe Hao
Surgical & Interventional Engineering, School of Biomedical Engineering & Imaging Sciences, King's College London, London
Weiyuan Deng
Surgical & Interventional Engineering, School of Biomedical Engineering & Imaging Sciences, King's College London, London
Benjamin Jackson
Surgical & Interventional Engineering, School of Biomedical Engineering & Imaging Sciences, King's College London, London
S. M. Hadi Sadati
School of Engineering & Materials Science, Queen Mary University of London, London
Nikola Fischer
Surgical & Interventional Engineering, School of Biomedical Engineering & Imaging Sciences, King's College London, London
Tom Vercauteren
Professor of Interventional Image Computing, King's College London
Medical Image Computing, Image Registration, Computer-assisted Interventions, Endomicroscopy, Image-guided Interventions
Alejandro Granados
KCL
Surgical Data Science, Generative Models, Causal AI
Thomas C. Booth
Surgical & Interventional Engineering, School of Biomedical Engineering & Imaging Sciences, King's College London, London