🤖 AI Summary
This work proposes a follow-the-leader flexible continuum endoscopic robot to address the challenge of frequent collisions with lumen walls during autonomous navigation in confined tubular environments. By integrating monocular depth estimation with deep reinforcement learning, the system leverages synthetic data generated via NVIDIA Replicator in a high-fidelity intestinal simulation environment built on NVIDIA Omniverse. The Depth Anything model is fine-tuned on this synthetic data to improve 3D perception accuracy, and a geometry-aware reward mechanism enables precise lumen tracking. To the best of our knowledge, this study is the first to integrate synthetic-data-driven depth estimation with reinforcement learning for endoscopic navigation: it improves δ₁ depth accuracy by 39.2% over the original model and reduces the navigation J-index by 0.67 compared to the next-best method, significantly enhancing obstacle avoidance capability and system robustness.
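The δ₁ score cited above is the standard threshold-accuracy metric for monocular depth estimation: the fraction of pixels whose predicted-to-ground-truth depth ratio (in either direction) falls below 1.25. A minimal Python sketch is shown below; the function name and masking convention are ours, not from the paper.

```python
import numpy as np

def delta1_accuracy(pred: np.ndarray, gt: np.ndarray,
                    mask: np.ndarray | None = None) -> float:
    """Standard delta_1 metric: fraction of pixels where
    max(pred/gt, gt/pred) < 1.25, evaluated over valid pixels."""
    if mask is None:
        mask = gt > 0  # ignore pixels without valid ground-truth depth
    ratio = np.maximum(pred[mask] / gt[mask], gt[mask] / pred[mask])
    return float((ratio < 1.25).mean())
```

A 39.2% improvement in this score means substantially more pixels of the fine-tuned model's depth map fall within the 1.25× tolerance band than the original Depth Anything model's.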
📝 Abstract
Autonomous navigation is crucial for both medical and industrial endoscopic robots, enabling safe and efficient exploration of narrow tubular environments without continuous human intervention; avoiding contact with the inner walls, however, has been a longstanding challenge for prior approaches. We present a follow-the-leader endoscopic robot based on a flexible continuum structure, designed to minimize contact between the endoscope body and the intestinal wall and thereby reduce patient discomfort. To achieve this, we propose a vision-based deep reinforcement learning framework guided by monocular depth estimation. A realistic intestinal simulation environment was constructed in *NVIDIA Omniverse* to train and evaluate autonomous navigation strategies, and thousands of synthetic intraluminal images were generated with NVIDIA Replicator to fine-tune the Depth Anything model, enabling dense three-dimensional perception of the intestinal environment from a single monocular camera. We further introduce a geometry-aware reward and penalty mechanism to enable accurate lumen tracking. Compared with the original Depth Anything model, our method improves $\delta_{1}$ depth accuracy by 39.2% and reduces the navigation J-index by 0.67 relative to the second-best method, demonstrating the robustness and effectiveness of the proposed approach.
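The abstract does not spell out the geometry-aware reward, so the following is only a plausible sketch of the idea, under two assumptions of ours: the lumen direction is approximated by the deepest region of the predicted depth map, and near-wall proximity is penalized via the minimum predicted depth. All names, thresholds, and weights are illustrative, not the paper's formulation.

```python
import numpy as np

def geometry_aware_reward(depth: np.ndarray,
                          center_weight: float = 1.0,
                          collision_threshold: float = 0.05,
                          collision_penalty: float = 1.0) -> float:
    """Hypothetical reward sketch for lumen tracking.

    Rewards keeping the deepest point of the depth map (a common proxy
    for the lumen center) near the image center, and penalizes frames
    where any pixel is dangerously close to the camera (small depth).
    """
    h, w = depth.shape
    # Deepest pixel as a proxy for the lumen direction.
    ty, tx = np.unravel_index(np.argmax(depth), depth.shape)
    # Normalized offset of that point from the image center.
    offset = np.hypot((ty - h / 2) / h, (tx - w / 2) / w)
    reward = -center_weight * offset
    # Geometry-aware penalty: something in the scene is too close.
    if depth.min() < collision_threshold:
        reward -= collision_penalty
    return float(reward)
```

In this reading, the "geometry-aware" term refers to shaping the reward directly from the dense 3D structure recovered by the fine-tuned depth model, rather than from raw image features alone.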