Text-Guided Multi-Instance Learning for Scoliosis Screening via Gait Video Analysis

📅 2025-07-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Early-stage scoliosis in adolescents is often asymptomatic and hard to detect at scale: X-ray screening involves ionizing radiation and relies heavily on expert interpretation. To address these limitations, we propose a non-invasive, video-based gait analysis method. First, dynamic time warping (DTW) clustering segments gait videos into key gait phases. Second, we introduce a text-guided multiple-instance learning framework that uses inter-bag temporal attention (IBTA) to emphasize diagnostically salient gait phases, together with a boundary-aware model (BAM) that improves sensitivity to borderline cases. Third, textual guidance from domain experts and large language models (LLMs) is incorporated to enhance feature representation and interpretability. Evaluated on the Scoliosis1K dataset, the approach achieves state-of-the-art performance, with particular strength under class imbalance and on hard borderline cases.

📝 Abstract
Early-stage scoliosis is often difficult to detect, particularly in adolescents, where delayed diagnosis can lead to serious health issues. Traditional X-ray-based methods carry radiation risks and rely heavily on clinical expertise, limiting their use in large-scale screenings. To overcome these challenges, we propose a Text-Guided Multi-Instance Learning Network (TG-MILNet) for non-invasive scoliosis detection using gait videos. To handle temporal misalignment in gait sequences, we employ Dynamic Time Warping (DTW) clustering to segment videos into key gait phases. To focus on the most relevant diagnostic features, we introduce an Inter-Bag Temporal Attention (IBTA) mechanism that highlights critical gait phases. Recognizing the difficulty in identifying borderline cases, we design a Boundary-Aware Model (BAM) to improve sensitivity to subtle spinal deviations. Additionally, we incorporate textual guidance from domain experts and large language models (LLMs) to enhance feature representation and improve model interpretability. Experiments on the large-scale Scoliosis1K gait dataset show that TG-MILNet achieves state-of-the-art performance, particularly excelling in handling class imbalance and accurately detecting challenging borderline cases. The code is available at https://github.com/lhqqq/TG-MILNet
Problem

Research questions and friction points this paper is trying to address.

Detect early-stage scoliosis non-invasively via gait videos
Address temporal misalignment in gait sequence analysis
Improve sensitivity to subtle spinal deviations in borderline cases
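
To make the temporal-misalignment point concrete, here is a minimal sketch of the textbook dynamic time warping (DTW) distance on 1-D signals. It is an illustration only, not the TG-MILNet implementation: the function name `dtw_distance` and the toy "gait" signals are assumptions made for this example, and the paper's DTW clustering operates on gait video features rather than on hand-written lists.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # cost[i, j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a, step in b, or step in both
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])

    return float(cost[n, m])


# Toy signals (made up): the same pattern at two walking speeds, plus a different one.
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast = [0, 1, 2, 1, 0]
other = [2, 1, 0, 1, 2]

print(dtw_distance(slow, fast))   # same shape at a different speed
print(dtw_distance(fast, other))  # different shape
```

Because DTW may match one sample to several, the slow and fast versions of the same pattern align at zero cost, while the differently shaped signal does not. A pairwise distance of this kind is what a DTW-based clustering step would typically consume.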
Innovation

Methods, ideas, or system contributions that make the work stand out.

Text-guided multi-instance learning for scoliosis detection
Dynamic Time Warping clusters key gait phases
Boundary-Aware Model enhances sensitivity to deviations
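
For readers new to multi-instance learning, the sketch below shows generic attention-based pooling over a bag of instances, the general mechanism that attention-style MIL models build on. It is not the paper's IBTA module: the function `attention_mil_pool`, the parameters `V` and `w`, and all shapes are assumptions chosen for illustration, with random arrays standing in for learned weights and gait-phase features.

```python
import numpy as np


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance features into one bag feature (illustrative only).

    instances: (K, D) array, one row per instance (e.g. one gait phase)
    V:         (D, H) projection matrix (random here, learned in practice)
    w:         (H,)   scoring vector   (random here, learned in practice)
    """
    scores = np.tanh(instances @ V) @ w           # (K,) one score per instance
    scores = scores - scores.max()                # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    bag_feature = weights @ instances             # (D,) attention-weighted average
    return bag_feature, weights


rng = np.random.default_rng(0)
instances = rng.normal(size=(4, 8))   # 4 hypothetical gait phases, 8-dim features
V = rng.normal(size=(8, 16))
w = rng.normal(size=16)

bag_feature, weights = attention_mil_pool(instances, V, w)
print(weights)             # non-negative attention weights that sum to 1
print(bag_feature.shape)   # (8,)
```

The weights indicate how much each instance contributes to the bag-level representation, which is also what makes this family of models convenient to inspect when asking which parts of a sequence drove a prediction.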
Authors

Haiqing Li
Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76019, USA

Yuzhi Guo
University of Texas at Arlington
Deep Learning, Bioinformatics

Feng Jiang
Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76019, USA

Thao M. Dang
PhD student, University of Texas at Arlington
Biometrics, Cryptography, Medical Image Processing

Hehuan Ma
University of Texas at Arlington
Machine Learning, Deep Learning, Graph Neural Network

Qifeng Zhou
Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76019, USA

Jean Gao
Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76019, USA

Junzhou Huang
Jenkins Garrett Professor, Computer Science and Engineering, the University of Texas at Arlington
Machine Learning, Medical Image Analysis, Graph Neural Networks, Computational Toxicology