🤖 AI Summary
This work addresses the high computational cost and latency of existing machine-assisted glottis detection systems, which hinder their applicability in emergency nasotracheal intubation (NTI) scenarios demanding real-time performance and low resource consumption. To this end, the authors propose Mobile GlottisNet, a lightweight framework that integrates an adaptive feature decoupling module, a hierarchical dynamic thresholding strategy, and a cross-layer dynamic weighted fusion mechanism. The design further incorporates deformable convolutions and dynamic sample assignment to achieve robust and accurate glottis localization under complex anatomical conditions. With a model size of only 5 MB, Mobile GlottisNet achieves over 62 FPS on-device inference and 33 FPS on edge platforms on both the PID and clinical datasets, effectively balancing accuracy and efficiency for deployment in resource-constrained emergency NTI settings.
📝 Abstract
Nasotracheal intubation (NTI) is a vital procedure in emergency airway management, where rapid and accurate glottis detection is essential to ensure patient safety. However, existing machine-assisted visual detection systems often rely on high-performance computational resources and suffer from significant inference delays, which limits their applicability in time-critical and resource-constrained scenarios. To overcome these limitations, we propose Mobile GlottisNet, a lightweight and efficient glottis detection framework designed for real-time inference on embedded and edge devices. The model incorporates structural awareness and spatial alignment mechanisms, enabling robust glottis localization under complex anatomical and visual conditions. We implement a hierarchical dynamic thresholding strategy to enhance sample assignment, and introduce an adaptive feature decoupling module based on deformable convolution to support dynamic spatial reconstruction. A cross-layer dynamic weighting scheme further facilitates the fusion of semantic and detail features across multiple scales. Experimental results on both our PID dataset and a clinical dataset demonstrate that the model, at a size of only 5 MB, achieves inference speeds of over 62 FPS on devices and 33 FPS on edge platforms, showing great potential for application in emergency NTI.
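The abstract names a dynamic thresholding strategy for sample assignment and a cross-layer dynamic weighted fusion scheme but does not specify either rule. As a rough illustration only (not the paper's actual method), the sketch below shows two common forms such components take in lightweight detectors: an ATSS-style per-object IoU threshold (mean plus standard deviation of candidate IoUs), and a BiFPN-style normalized weighted sum of feature maps. The function names and the specific formulas are assumptions for illustration.

```python
import numpy as np

def dynamic_threshold_assign(ious):
    """Mark candidate anchors as positive for one ground-truth box.

    Uses an ATSS-style dynamic threshold (mean + std of candidate IoUs)
    as a stand-in for the paper's unspecified hierarchical rule.
    ious: 1-D array of IoUs between candidate anchors and the box.
    Returns a boolean mask of positive candidates.
    """
    threshold = ious.mean() + ious.std()
    return ious >= threshold

def weighted_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with normalized non-negative weights.

    A BiFPN-style weighted sum, used here only to illustrate what
    "cross-layer dynamic weighted fusion" typically means; in a trained
    model the weights would be learnable per fusion node.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    w = w / (w.sum() + eps)  # normalize so the weights sum to ~1
    return sum(wi * f for wi, f in zip(w, features))

# Anchors with clearly higher IoU than the rest become positives.
candidate_ious = np.array([0.10, 0.20, 0.65, 0.70, 0.15])
positives = dynamic_threshold_assign(candidate_ious)

# Two equally weighted maps fuse to (approximately) their average.
fused = weighted_fusion(
    [np.ones((2, 2)), 3 * np.ones((2, 2))],
    weights=[1.0, 1.0],
)
```

In practice the dynamic threshold is computed per ground-truth object over a small candidate set drawn from each pyramid level, which is what makes the assignment adapt to object scale without hand-tuned IoU cutoffs.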