🤖 AI Summary
To address the scarcity of manually annotated data and the high inter-subject variability that limit model generalizability in diagnosing pulmonary diseases (e.g., lung cancer, COPD), this paper proposes a semi-supervised lung sound analysis framework. Building on MFCC feature extraction and a CNN-based classifier, the method integrates three complementary semi-supervised modules (MixMatch, Co-Refinement, and Co-Refurbishing) to train jointly on limited labeled and abundant unlabeled lung sound recordings. Experimental results show that the proposed approach achieves 92.9% classification accuracy under constrained labeling budgets, outperforming the supervised baseline by 3.8 percentage points. The results suggest improved model robustness and a scalable technical pathway for intelligent auscultation in low-resource healthcare settings.
📝 Abstract
Lung diseases, including lung cancer and COPD, are significant health concerns globally. Traditional diagnostic methods can be costly, time-consuming, and invasive. This study investigates semi-supervised learning methods for lung sound signal detection using a combined MFCC+CNN model. By introducing semi-supervised learning modules such as MixMatch, Co-Refinement, and Co-Refurbishing, we aim to enhance detection performance while reducing dependence on manual annotations. With the add-on semi-supervised modules, the MFCC+CNN model reaches an accuracy of 92.9%, an increase of 3.8% over the baseline model. The research contributes to the field of lung disease sound detection by addressing challenges such as individual differences and insufficient labeled data.
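For readers unfamiliar with MixMatch, the following is a minimal NumPy sketch of its two core operations: guessing soft labels for unlabeled samples by averaging and sharpening predictions, and MixUp interpolation between samples. It is an illustration only, not the paper's implementation. The function names (`sharpen`, `guess_labels`, `mixup`) are hypothetical, and the defaults (`temperature=0.5`, `alpha=0.75`) are assumed from the general MixMatch method rather than taken from this abstract. The inputs stand in for MFCC feature arrays and CNN softmax outputs.

```python
import numpy as np


def sharpen(probs, temperature=0.5):
    """Lower the entropy of class probabilities: p_i^(1/T) / sum_j p_j^(1/T)."""
    powered = probs ** (1.0 / temperature)
    return powered / powered.sum(axis=-1, keepdims=True)


def guess_labels(aug_predictions, temperature=0.5):
    """Guess soft labels for unlabeled samples.

    aug_predictions has shape (K, batch, classes): the model's softmax
    outputs for K augmented views of each unlabeled recording.
    """
    mean_pred = aug_predictions.mean(axis=0)  # average over the K views
    return sharpen(mean_pred, temperature)


def mixup(x1, y1, x2, y2, alpha=0.75, rng=None):
    """Interpolate two batches of features and labels with a Beta-sampled weight."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # keep the result closer to the first batch
    x_mixed = lam * x1 + (1.0 - lam) * x2
    y_mixed = lam * y1 + (1.0 - lam) * y2
    return x_mixed, y_mixed, lam
```

In a full training loop, the guessed labels would be paired with the unlabeled features, mixed with the labeled batch via `mixup`, and fed to a combined supervised and consistency loss. Those steps, and the Co-Refinement and Co-Refurbishing modules, are omitted here because the abstract does not describe them.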