Long-Short Term Agents for Pure-Vision Bronchoscopy Robotic Autonomy

📅 2026-03-09
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work proposes a purely vision-based autonomous navigation framework for robot-assisted bronchoscopy, addressing challenges such as limited field of view, dynamic artifacts, and reliance on external localization systems. By integrating preoperative CT scans with real-time endoscopic video, the approach combines a hierarchical long–short-term agent architecture, a vision-based predictive world model, cross-modal alignment between CT and endoscopy, and low-latency motion control to enable long-range navigation without external sensors. The system achieved a 100% navigation success rate in a phantom, 80% in ex vivo porcine lungs (reaching up to eighth-generation bronchi), and in vivo performance comparable to that of expert physicians. This study presents the first preclinical validation of sensor-free visual navigation for bronchoscopic interventions.

πŸ“ Abstract
Accurate intraoperative navigation is essential for robot-assisted endoluminal intervention, but remains difficult because of the limited endoscopic field of view and dynamic artifacts. Existing navigation platforms often rely on external localization technologies, such as electromagnetic tracking or shape sensing, which increase hardware complexity and remain vulnerable to intraoperative anatomical mismatch. We present a vision-only autonomy framework that performs long-horizon bronchoscopic navigation using preoperative CT-derived virtual targets and live endoscopic video, without external tracking during navigation. The framework uses hierarchical long–short-term agents: a short-term reactive agent for continuous low-latency motion control, and a long-term strategic agent for decision support at anatomically ambiguous points. When their recommendations conflict, a world-model critic predicts future visual states for candidate actions and selects the action whose predicted state best matches the target view. We evaluated the system in a high-fidelity airway phantom, three ex vivo porcine lungs, and a live porcine model. The system reached all planned segmental targets in the phantom, maintained 80% success to the eighth generation ex vivo, and achieved in vivo navigation performance comparable to that of an expert bronchoscopist. These results support the preclinical feasibility of sensor-free autonomous bronchoscopic navigation.
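The world-model critic described above can be sketched as a simple arbitration step: when the short-term and long-term agents disagree, each candidate action is rolled forward through a predictive world model, and the action whose predicted view is most similar to the CT-derived target view wins. The sketch below is a hypothetical illustration under strong simplifying assumptions (views as plain feature vectors, a toy world model, cosine similarity as the matching score); none of these names or choices come from the paper itself.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def critic_select(current_view, candidate_actions, world_model, target_view):
    """Predict the next view for each candidate action and return the
    action whose predicted view best matches the target view."""
    scores = [cosine(world_model(current_view, a), target_view)
              for a in candidate_actions]
    return candidate_actions[int(np.argmax(scores))]

# Toy world model (illustrative only): each action shifts the view
# embedding by a fixed direction in a 2-D feature space.
def toy_world_model(view, action):
    shifts = {"advance":     np.array([1.0,  0.0]),
              "steer_left":  np.array([0.0,  1.0]),
              "steer_right": np.array([0.0, -1.0])}
    return view + shifts[action]

current = np.array([0.0, 0.0])
target = np.array([0.0, 1.0])  # target view lies in the "steer_left" direction
best = critic_select(current, ["advance", "steer_left", "steer_right"],
                     toy_world_model, target)
print(best)  # steer_left
```

In practice the views would be learned embeddings of endoscopic frames and the world model a trained video-prediction network, but the arbitration logic stays the same.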
Problem

Research questions and friction points this paper is trying to address.

bronchoscopic navigation
vision-only autonomy
intraoperative navigation
external tracking
anatomical mismatch
Innovation

Methods, ideas, or system contributions that make the work stand out.

vision-only navigation
hierarchical agents
world-model critic
autonomous bronchoscopy
preoperative CT integration
Junyang Wu
Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China.
Mingyi Luo
Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China.
Fangfang Xie
Shanghai Chest Hospital, Shanghai, 10587, China.
Minghui Zhang
Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China.
Hanxiao Zhang
Nanjing University
Chunxi Zhang
Shanghai Chest Hospital, Shanghai, 10587, China.
Junhao Wang
Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China.
Jiayuan Sun
Shanghai Chest Hospital, Shanghai, 10587, China.
Yun Gu
Shanghai Jiao Tong University
Guang-Zhong Yang
Shanghai Chest Hospital, Shanghai, 10587, China.