🤖 AI Summary
This study addresses two limitations of existing respiratory sound classification models: the datasets are small and cover few subjects, and ensembles generalize poorly because base models trained on identical data produce highly correlated predictions. To overcome these challenges, the authors propose a meta-ensemble learning framework that increases prediction diversity by training base models under two data split settings (a fixed 80-20% split and five-fold cross-validation) and at two data granularities (patient-level and sample-level). A trained meta-model then fuses the outputs of these diverse base models. The proposed approach achieves a state-of-the-art Score of 66.49% on the ICBHI benchmark and generalizes better on two external out-of-distribution datasets, indicating its potential applicability to real-world clinical data.
📝 Abstract
Training reliable respiratory sound classification models remains challenging due to the limited size and subject diversity of datasets. Ensemble methods can improve robustness, but when base models are trained on identical data, they tend to overfit and produce highly correlated predictions, which reduces the effectiveness of ensembling. In this work, we investigate a meta-ensemble learning methodology that enhances prediction diversity by training base models on diverse data splits and combining their outputs through a trained meta-model. Specifically, we train base models on the ICBHI dataset using two data split settings, a fixed 80-20% split and a five-fold cross-validation split, under two data granularity settings, patient-level and sample-level. The resulting diversity in base model predictions enables the meta-model to generalize better. Our approach achieves new state-of-the-art performance on the ICBHI benchmark, reaching a Score of 66.49%, and shows improved generalization on two out-of-distribution datasets, indicating its potential applicability to real-world clinical data.
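The difference between the two granularity settings can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy data, and the round-robin fold assignment are assumptions made for illustration. A sample-level split shuffles individual recordings, so one patient may appear in several folds, whereas a patient-level split assigns whole patients to folds.

```python
import random
from collections import defaultdict


def sample_level_folds(samples, k=5, seed=0):
    """Assign each sample to a fold, ignoring which patient it came from."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    folds = [[] for _ in range(k)]
    for position, index in enumerate(indices):
        folds[position % k].append(index)
    return folds


def patient_level_folds(samples, k=5, seed=0):
    """Assign whole patients to folds so that no patient spans two folds."""
    by_patient = defaultdict(list)
    for index, (patient_id, _label) in enumerate(samples):
        by_patient[patient_id].append(index)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    folds = [[] for _ in range(k)]
    for position, patient_id in enumerate(patients):
        folds[position % k].extend(by_patient[patient_id])
    return folds


def patients_shared_across_folds(samples, folds):
    """Return the set of patient IDs that occur in more than one fold."""
    first_fold = {}
    shared = set()
    for fold_number, fold in enumerate(folds):
        for index in fold:
            patient_id = samples[index][0]
            if first_fold.setdefault(patient_id, fold_number) != fold_number:
                shared.add(patient_id)
    return shared


# Toy data (hypothetical): 10 patients, 4 labelled samples each.
samples = [(f"p{p}", (p + c) % 4) for p in range(10) for c in range(4)]
patient_folds = patient_level_folds(samples)
sample_folds = sample_level_folds(samples)
```

With `k=5`, holding out a single fold corresponds to an 80-20% split, and rotating the held-out fold gives five-fold cross-validation. Calling `patients_shared_across_folds(samples, sample_folds)` shows which patients a sample-level split spreads over several folds; for `patient_folds` the result is always empty.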