AI Summary
This study addresses the poor generalization and low efficiency of existing wireless capsule endoscopy navigation in the stomach, which stem primarily from substantial inter-patient anatomical variation and result in incomplete mucosal coverage. To overcome these limitations, the authors propose an anatomical landmark-guided deep reinforcement learning framework that replaces high-dimensional visual inputs with low-dimensional anatomical landmarks. The approach integrates a lightweight edge-contour-depth fusion module with an adaptive dynamic programming controller and employs a two-stage simulation-to-reality transfer strategy. In simulations across eight patient-derived models, the method achieves over 97% mucosal coverage within 50 seconds. In ex vivo experiments, it attains an average coverage of 87% and reduces examination time by 53% compared with expert manual operation, indicating improved cross-patient generalization and robustness.
Abstract
Wireless capsule endoscopy (WCE) enables painless visualization of the gastrointestinal tract, but its diagnostic potential is limited by incomplete mucosal coverage and poor transferability of existing navigation methods across patient anatomies. We propose a transferable, anatomical landmark-guided deep reinforcement learning (AL-DRL) framework for autonomous gastric navigation. Leveraging a lightweight edge-contour-depth fusion module, our policy operates on stable, low-dimensional landmark coordinates rather than high-dimensional video streams, effectively bridging the sim-to-real gap. In simulations across eight patient-derived models, the method achieves over 97% coverage within 50 seconds, significantly outperforming vanilla PPO, SAC, and DQN agents. A two-stage sim-to-real pipeline with an adaptive dynamic programming controller actively mitigates physical disturbances. Ex-vivo experiments demonstrate a mean coverage of 87% and a 53% reduction in procedure time compared with expert manual control.
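To make the idea of a low-dimensional landmark observation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that landmarks are available as 3D coordinates, that the observation is their position relative to the capsule, and that coverage is the fraction of discretized mucosal patches already seen. The function names, the landmark values, the normalization scale, and the 320 x 320 frame size are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def landmark_observation(
    capsule_pos: Point3, landmarks: Sequence[Point3], scale: float = 1.0
) -> List[float]:
    """Flatten landmark positions, expressed relative to the capsule, into one vector.

    Hypothetical helper: the abstract does not specify the observation layout.
    """
    obs: List[float] = []
    for lx, ly, lz in landmarks:
        obs.extend(
            [
                (lx - capsule_pos[0]) / scale,
                (ly - capsule_pos[1]) / scale,
                (lz - capsule_pos[2]) / scale,
            ]
        )
    return obs


def coverage_fraction(seen: Sequence[bool]) -> float:
    """Fraction of discretized mucosal patches marked as observed (assumed metric)."""
    if not seen:
        return 0.0
    return sum(seen) / len(seen)


# Hypothetical values: four landmarks yield a 12-dimensional observation.
landmarks = [
    (10.0, 0.0, 0.0),
    (0.0, 20.0, 0.0),
    (0.0, 0.0, 30.0),
    (10.0, 20.0, 30.0),
]
obs = landmark_observation((0.0, 0.0, 0.0), landmarks, scale=10.0)

# For comparison, a raw RGB frame of an assumed 320 x 320 size.
image_dim = 320 * 320 * 3

print(len(obs), image_dim)
print(coverage_fraction([True, True, True, False]))
```

Under these assumptions the policy input has 12 values instead of several hundred thousand pixel values, which illustrates why a landmark-based observation could be less sensitive to appearance differences between simulation and real tissue.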