Real-Time Glottis Detection Framework via Spatial-decoupled Feature Learning for Nasal Transnasal Intubation

📅 2026-03-08
🤖 AI Summary
This work addresses the high computational cost and latency of existing machine-assisted glottis detection systems, which limit their use in emergency nasotracheal intubation (NTI), where real-time performance and low resource consumption are required. The authors propose Mobile GlottisNet, a lightweight framework that combines an adaptive feature decoupling module based on deformable convolution, a hierarchical dynamic thresholding strategy for sample assignment, and a cross-layer dynamic weighting scheme for fusing features across scales, with the aim of robust glottis localization under complex anatomical and visual conditions. With a model size of only 5 MB, Mobile GlottisNet is reported to reach inference speeds of over 62 FPS on devices and 33 FPS on edge platforms in experiments on the PID and clinical datasets, balancing accuracy and efficiency for deployment in resource-constrained emergency NTI settings.
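To put the reported frame rates in perspective, the short sketch below converts them into a per-frame time budget. It is plain arithmetic on the two figures quoted above (roughly 62 FPS and 33 FPS); the function name `frame_budget_ms` is a hypothetical helper introduced only for illustration and is not taken from the paper.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame rates as quoted in the summary; the labels are descriptive only.
for label, fps in [("device", 62.0), ("edge platform", 33.0)]:
    print(f"{label}: {fps:.0f} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
```

At these rates the whole pipeline (preprocessing, inference, and post-processing) would have to fit in about 16.1 ms per frame on the device and about 30.3 ms on the edge platform.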

📝 Abstract
Nasotracheal intubation (NTI) is a vital procedure in emergency airway management, where rapid and accurate glottis detection is essential to ensure patient safety. However, existing machine-assisted visual detection systems often rely on high-performance computational resources and suffer from significant inference delays, which limits their applicability in time-critical and resource-constrained scenarios. To overcome these limitations, we propose Mobile GlottisNet, a lightweight and efficient glottis detection framework designed for real-time inference on embedded and edge devices. The model incorporates structural awareness and spatial alignment mechanisms, enabling robust glottis localization under complex anatomical and visual conditions. We implement a hierarchical dynamic thresholding strategy to enhance sample assignment, and introduce an adaptive feature decoupling module based on deformable convolution to support dynamic spatial reconstruction. A cross-layer dynamic weighting scheme further facilitates the fusion of semantic and detail features across multiple scales. Experimental results on both our PID dataset and clinical datasets demonstrate that the model, with a size of only 5 MB, achieves inference speeds of over 62 FPS on devices and 33 FPS on edge platforms, showing great potential for emergency NTI applications.
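The abstract does not spell out how the hierarchical dynamic thresholding works, so the following is only a generic sketch of dynamic-threshold sample assignment, not the authors' method. It assumes a mean-plus-standard-deviation rule in the spirit of ATSS: for one ground-truth box, candidates whose IoU reaches the threshold computed from the candidate set itself are marked positive. The function name `dynamic_threshold_assign`, the use of the population standard deviation, and the IoU values are all illustrative assumptions.

```python
from statistics import mean, pstdev


def dynamic_threshold_assign(ious):
    """Select positive candidates for one ground-truth box.

    The threshold is not fixed in advance: it is the mean plus the
    (population) standard deviation of the candidates' IoU values.
    Returns the indices of the positive candidates and the threshold used.
    """
    if not ious:
        return [], 0.0
    threshold = mean(ious) + pstdev(ious)
    positives = [i for i, iou in enumerate(ious) if iou >= threshold]
    return positives, threshold


# Made-up IoU values between four candidate boxes and one ground-truth box.
candidate_ious = [0.1, 0.2, 0.6, 0.7]
positives, threshold = dynamic_threshold_assign(candidate_ious)
print(positives, round(threshold, 3))
```

Because the threshold adapts to the IoU statistics of each object, a hard-to-match target with uniformly low IoUs still receives some positive samples, which is the usual motivation for replacing a fixed IoU cutoff.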
Problem

Research questions and friction points this paper is trying to address.

glottis detection
nasotracheal intubation
real-time inference
resource-constrained environments
emergency airway management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mobile GlottisNet
spatial-decoupled feature learning
real-time glottis detection
deformable convolution
edge inference
Jinyu Liu
Hubei Key Laboratory of Modern Manufacturing Quality Engineering, Hubei University of Technology, Nanhu Avenue 28, Wuhan, 430068, Hubei, China
Gaoyang Zhang
Hubei Key Laboratory of Modern Manufacturing Quality Engineering, Hubei University of Technology, Nanhu Avenue 28, Wuhan, 430068, Hubei, China
Yang Zhou
Huazhong University of Science and Technology
Photovoltaics, Halide segregation, Defect activity, Charge carrier dynamics, Imaging
Ruoyi Hao
Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, 999077, China
Yang Zhang
Hubei Key Laboratory of Modern Manufacturing Quality Engineering, Hubei University of Technology, Nanhu Avenue 28, Wuhan, 430068, Hubei, China
Hongliang Ren
Chinese University of Hong Kong | National University of Singapore | JHU/Harvard(RF) | CUHK(PhD)
Biorobotics & intelligent systems, medical mechatronics, continuum, soft flexible robots/sensors, multisensory perception