Variable-frame CNNLSTM for Breast Nodule Classification using Ultrasound Videos

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Conventional key-frame methods for classifying breast nodules in ultrasound video ignore temporal dynamics, while 3D-CNN approaches require the same number of frames for every patient, which hurts both feature extraction efficiency and classification performance. Method: This work adapts the variable-length sequence handling used in natural language processing (NLP) to medical video analysis and proposes a variable-frame CNN-LSTM architecture. A CNN reduces each frame to a 1×512 feature vector; the per-patient feature sequences keep their frame order, are sorted by frame count, and are zero-padded into variable-size batches; the padded (invalid) positions are then compressed away before LSTM-based temporal modeling, which preserves temporal integrity and saves computation. Contribution/Results: Compared with key-frame methods, the proposed method improves F1-score by 3–6% and specificity by 1.5%. It also achieves better accuracy and precision than a fixed-length (equal-frame) CNN-LSTM.
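The dynamic batching step mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' code: the function names, the batch size, and the toy 4-dimensional features (standing in for the 1×512 vectors) are assumptions made for illustration. Videos are sorted by frame count, grouped into batches, and each batch is zero-padded only to the length of its own longest video.

```python
from typing import Dict, List, Sequence

Feature = List[float]   # one frame's feature vector (1x512 in the paper, tiny here)
Video = List[Feature]   # one patient's video: frame features in original order


def make_dynamic_batches(videos: Sequence[Video], batch_size: int,
                         feat_dim: int) -> List[Dict]:
    """Group videos of similar length; zero-pad each batch to its longest video."""
    # Sort video indices by frame count, longest first.
    # The frame order inside each video is left untouched.
    order = sorted(range(len(videos)), key=lambda i: len(videos[i]), reverse=True)
    zero = [0.0] * feat_dim
    batches = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        max_len = len(videos[idx[0]])  # first entry is the longest in this batch
        padded = [videos[i] + [zero] * (max_len - len(videos[i])) for i in idx]
        lengths = [len(videos[i]) for i in idx]
        batches.append({"indices": idx, "lengths": lengths, "padded": padded})
    return batches


def padded_slots(batches: List[Dict]) -> int:
    """Count the zero frames that padding added across all batches."""
    return sum(len(b["padded"][0]) * len(b["lengths"]) - sum(b["lengths"])
               for b in batches)


# Hypothetical toy data: five videos with 3, 7, 2, 6 and 3 frames.
frame_counts = [3, 7, 2, 6, 3]
videos = [[[float(n)] * 4 for _ in range(n)] for n in frame_counts]
batches = make_dynamic_batches(videos, batch_size=2, feat_dim=4)
print([b["lengths"] for b in batches], padded_slots(batches))
```

In this toy case, padding every video to the global maximum of 7 frames would add 14 zero frames, whereas per-batch padding adds only one. The true lengths are kept alongside each batch so that a later step can tell real frames from padding.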

📝 Abstract
The intersection of medical imaging and artificial intelligence has become an important research direction in intelligent medical treatment, particularly the use of deep learning to analyze medical images for clinical diagnosis. Despite these advances, existing keyframe classification methods do not extract time-series features, while ultrasound video classification based on three-dimensional convolution requires a uniform number of frames across patients, resulting in poor feature extraction efficiency and model classification performance. This study proposes a novel video classification method based on CNN and LSTM, introducing NLP's scheme for handling long and short sentences into video classification for the first time. The method reduces the CNN-extracted image features to a 1x512 dimension, then sorts and compresses the feature vectors for LSTM training. Specifically, feature vectors are sorted by the number of frames in each patient's video and padded with the value 0 to form variable batches, and the invalid padding values are compressed before LSTM training to conserve computing resources. Experimental results demonstrate that our variable-frame CNNLSTM method outperforms other approaches across all metrics, showing improvements of 3-6% in F1 score and 1.5% in specificity compared to keyframe methods. The variable-frame CNNLSTM also achieves better accuracy and precision than equal-frame CNNLSTM. These findings validate the effectiveness of our approach in classifying variable-frame ultrasound videos and suggest potential applications in other medical imaging modalities.
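For readers less familiar with the two metrics quoted in the abstract, the sketch below computes them from confusion-matrix counts using their standard definitions. The counts are hypothetical and are not results from the paper. Treating malignant nodules as the positive class is likewise an assumption, since the abstract does not state the labeling convention.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in terms of raw counts."""
    return 2 * tp / (2 * tp + fp + fn)


def specificity(tn: int, fp: int) -> float:
    """True-negative rate: share of negative cases that were predicted negative."""
    return tn / (tn + fp)


# Hypothetical counts for 100 videos (assumed positive class: malignant).
tp, fp, fn, tn = 40, 10, 10, 40
print(f"F1 = {f1_score(tp, fp, fn):.2f}, specificity = {specificity(tn, fp):.2f}")
```

F1 depends only on how positives are handled (true positives, false positives, false negatives), while specificity depends only on negatives, which is why the two are reported separately.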
Problem

Research questions and friction points this paper is trying to address.

Classifies breast nodules using ultrasound videos
Improves time series feature extraction efficiency
Enhances model performance with variable frame handling
Innovation

Methods, ideas, or system contributions that make the work stand out.

CNN and LSTM for video classification
Variable-frame processing with padding
Feature compression before LSTM training
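The padding-compression idea in the list above can be sketched in plain Python. This is an illustration under assumptions, not the authors' implementation: the function name `pack_padded` is hypothetical, and the integer "frames" stand in for 1×512 feature vectors. The output layout (time-major data plus the number of active sequences per step) mirrors the one used by PyTorch's `pack_padded_sequence`, although the page does not say which framework the paper uses.

```python
from typing import List, Tuple


def pack_padded(padded: List[list], lengths: List[int]) -> Tuple[list, List[int]]:
    """Drop padding from a batch whose sequences are sorted longest first.

    Returns (data, batch_sizes): data lists the real frames step by step,
    and batch_sizes[t] is how many sequences still have a frame at step t.
    """
    if any(a < b for a, b in zip(lengths, lengths[1:])):
        raise ValueError("sequences must be sorted by length, longest first")
    data, batch_sizes = [], []
    max_len = lengths[0] if lengths else 0
    for t in range(max_len):
        # Because of the sort, the sequences active at step t are the first few rows.
        active = sum(1 for n in lengths if n > t)
        batch_sizes.append(active)
        data.extend(padded[b][t] for b in range(active))
    return data, batch_sizes


# Toy batch: three videos with 4, 3 and 1 real frames, zero-padded to 4.
padded = [
    [11, 12, 13, 14],
    [21, 22, 23, 0],
    [31, 0, 0, 0],
]
lengths = [4, 3, 1]
data, batch_sizes = pack_padded(padded, lengths)
print(data)         # real frames only, grouped by time step
print(batch_sizes)  # sequences processed at each time step
```

Here the padded batch holds 12 slots but only 8 real frames survive packing, so a recurrent layer consuming the packed form would skip the four padded positions. Sorting by length first is what makes the active sequences at every step a contiguous prefix of the batch.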
🔎 Similar Papers
Xiangxiang Cui
The State Key Lab of Cognitive Neuroscience and Learning, Beijing Normal University
Zhongyu Li
School of Software Engineering, Xi’an Jiaotong University
Peng Huang
Department of General Surgery, Xiangya Hospital, Central South University
Ying Wang
Department of General Surgery, The Second Hospital of Hebei Medical University
Meng Yang
Frontline Intelligent Technology (Nanjing) Co., Ltd.
Shi Chang
Cornell University
active perception, sensor motion planning
Jihua Zhu
School of Software Engineering, Xi’an Jiaotong University