A Real-time Scale-robust Network for Glottis Segmentation in Nasal Transnasal Intubation

📅 2026-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses high-precision, real-time glottis segmentation during nasotracheal intubation (NTI), where large scale variations, complex anatomical structures, and poor illumination hinder existing visual algorithms. The authors propose a lightweight glottis segmentation framework built on a multi-receptive-field feature extraction module that reduces intra-class variation and provides robustness to scale changes; this module is stacked to form both the backbone and the neck of the network. They also redesign the label assignment method and redefine the number of samples to further reduce intra-class differences and improve accuracy. Evaluated on three datasets, the method achieves a mean Dice score (mDice) of 92.9%, with a model size of 19 MB and an inference speed exceeding 170 FPS, surpassing the state-of-the-art algorithms it is compared against.

📝 Abstract

Nasotracheal intubation (NTI) is a critical clinical procedure for establishing and maintaining patient airway patency. Machine-assisted NTI has emerged as a pivotal approach for optimizing procedural efficiency and minimizing manual intervention. However, visual detection algorithms employed for NTI navigation encounter significant challenges, including complex anatomical environments and suboptimal illumination conditions surrounding the glottis. Additionally, the glottis presents considerable scale variability throughout the procedure, initially appearing as a small, difficult-to-capture structure before expanding to occupy nearly the entire field of view. Moreover, traditional visual detection methods often have high computational costs, making real-time, high-precision detection on portable devices challenging. To enhance NTI efficacy and address these challenges, this paper proposes a novel glottis segmentation framework optimized for vision-assisted NTI applications. First, we designed a lightweight, multi-receptive-field feature extraction module to reduce intra-class differences, achieving robustness to scale variations of the glottis. This module was then stacked to form the backbone and neck of our network. Subsequently, we developed an advanced label assignment method and redefined the number of samples to further reduce intra-class differences and enhance accuracy in the complex NTI environment. Experiments on three distinct datasets demonstrate that our network surpasses state-of-the-art algorithms, achieving a segmentation mDice of 92.9% with a compact model size of 19 MB and an inference speed exceeding 170 frames per second. Our code and datasets are available at https://github.com/HBUT-CV/GlottisNet.
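
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a Dice score and its mean over a set of images can be computed for binary segmentation masks. It is not taken from the authors' repository. The function names `dice_score` and `mean_dice`, the nested-list mask format, the per-image averaging, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration; the abstract does not state how mDice is aggregated.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values with identical shape
    (an illustrative format, not the paper's data format).
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pairs):
    """Average the per-image Dice over (prediction, ground truth) pairs."""
    scores = [dice_score(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


# Toy 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
partial = dice_score(pred_mask, true_mask)   # 2 * 2 / (3 + 3)
perfect = dice_score(true_mask, true_mask)   # identical masks
average = mean_dice([(pred_mask, true_mask), (true_mask, true_mask)])
```

Under this reading, an mDice of 92.9% means that the overlap term 2|A∩B| averages 92.9% of the combined foreground area |A| + |B| across the evaluated images.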

Problem

Research questions and friction points this paper is trying to address.

glottis segmentation

nasotracheal intubation

scale variation

real-time detection

visual navigation

Innovation

Methods, ideas, or system contributions that make the work stand out.

scale-robust segmentation

lightweight network

multi-receptive field