Meta-Ensemble Learning with Diverse Data Splits for Improved Respiratory Sound Classification

📅 2026-04-27

🤖 AI Summary
This study addresses two limitations of existing respiratory sound classification models: the available datasets are small and low in subject diversity, and ensembles generalize poorly because base models trained on identical data produce highly correlated predictions. To overcome these challenges, the authors propose a meta-ensemble learning framework that increases model diversity by training base models under distinct data partitions (a fixed 80-20% split versus five-fold cross-validation) and at different granularities (patient-level versus sample-level). A trained meta-model then fuses the outputs of these diverse base models. The approach achieves a state-of-the-art Score of 66.49% on the ICBHI benchmark and shows improved generalization on two out-of-distribution datasets, indicating its potential applicability to real-world clinical data.
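The fusion step can be illustrated with a small stacking sketch. This is not the authors' implementation: the summary does not say what form the meta-model takes, so the softmax regression below, the helper names (`stack_features`, `fit_meta_model`, `predict`), the hyperparameters, and the toy probabilities are all assumptions made for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of class scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def stack_features(base_probs):
    # base_probs[m][i] is base model m's class-probability vector for sample i.
    # Each sample's meta-feature is the concatenation over all base models.
    return [[p for probs in per_sample for p in probs]
            for per_sample in zip(*base_probs)]

def class_scores(weights, biases, x):
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def fit_meta_model(features, labels, n_classes, lr=0.5, epochs=300):
    # Hypothetical meta-model: softmax regression trained with
    # full-batch gradient descent on the cross-entropy loss.
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, target in zip(features, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == target else 0.0)
                grad_b[c] += err
                for j in range(dim):
                    grad_w[c][j] += err * x[j]
        for c in range(n_classes):
            biases[c] -= lr * grad_b[c] / len(features)
            for j in range(dim):
                weights[c][j] -= lr * grad_w[c][j] / len(features)
    return weights, biases

def predict(weights, biases, x):
    scores = class_scores(weights, biases, x)
    return max(range(len(scores)), key=scores.__getitem__)

# Made-up outputs of two base models on four samples (two classes).
model_a = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8], [0.1, 0.9]]
labels = [0, 0, 1, 1]

features = stack_features([model_a, model_b])
weights, biases = fit_meta_model(features, labels, n_classes=2)
predictions = [predict(weights, biases, x) for x in features]
```

In a real stacking setup the meta-model would normally be fit on base-model predictions for data the base models did not train on; the toy example above skips that step for brevity.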

📝 Abstract
Training reliable respiratory sound classification models remains challenging due to the limited size and subject diversity of datasets. Ensemble methods can improve robustness, but when base models are trained on identical data, they tend to overfit and produce highly correlated predictions, thereby reducing the effectiveness of ensembling. In this work, we investigate a meta-ensemble learning methodology that enhances prediction diversity by training base models on diverse data splits and combining their outputs through a trained meta-model. Specifically, we train base models on the ICBHI dataset using two data split settings, a fixed 80-20% split and a five-fold cross-validation split, under two data granularity settings, patient-level and sample-level. The resulting diversity in base model predictions enables the meta-model to better generalize. Our approach achieves new state-of-the-art performance on the ICBHI benchmark, reaching a Score of 66.49% and showing improved generalization on two out-of-distribution datasets, indicating its potential applicability to real-world clinical data.
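The four training configurations described above (two split settings crossed with two granularities) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the seeding, and the toy patient IDs are hypothetical, and the abstract does not specify how folds or the 80-20% partition were actually drawn.

```python
import random
from collections import defaultdict

def fixed_split(units, train_frac=0.8, seed=0):
    # One shuffled train/test partition, returned as a one-element list
    # so it has the same shape as the k-fold output.
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return [(shuffled[:cut], shuffled[cut:])]

def kfold_split(units, k=5, seed=0):
    # k partitions; each fold serves once as the held-out part.
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    return [([u for j, fold in enumerate(folds) if j != i for u in fold], folds[i])
            for i in range(k)]

def make_splits(patient_ids, scheme, granularity, seed=0):
    # patient_ids[i] is the patient that sample i was recorded from.
    # Returns a list of (train_sample_indices, test_sample_indices).
    if scheme not in ("fixed", "kfold"):
        raise ValueError("scheme must be 'fixed' or 'kfold'")
    if granularity not in ("patient", "sample"):
        raise ValueError("granularity must be 'patient' or 'sample'")
    splitter = fixed_split if scheme == "fixed" else kfold_split

    if granularity == "sample":
        # Samples are shuffled individually, so one patient's recordings
        # may appear on both sides of a split.
        return [(sorted(train), sorted(test))
                for train, test in splitter(range(len(patient_ids)), seed=seed)]

    # Patient-level: split the patients, then expand to their samples,
    # so no patient appears on both sides.
    by_patient = defaultdict(list)
    for index, patient in enumerate(patient_ids):
        by_patient[patient].append(index)
    return [(sorted(i for p in train_p for i in by_patient[p]),
             sorted(i for p in test_p for i in by_patient[p]))
            for train_p, test_p in splitter(sorted(by_patient), seed=seed)]

# Toy data: 10 hypothetical patients with 4 samples each.
patient_ids = [i // 4 for i in range(40)]
configs = {(s, g): make_splits(patient_ids, s, g)
           for s in ("fixed", "kfold") for g in ("patient", "sample")}
```

Each entry of `configs` would then feed the training of one group of base models, whose predictions differ because they saw different partitions of the same data.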
Problem

Research questions and friction points this paper is trying to address.

respiratory sound classification
limited dataset size
subject diversity
ensemble overfitting
prediction correlation
Innovation

Methods, ideas, or system contributions that make the work stand out.

meta-ensemble learning
diverse data splits
respiratory sound classification
out-of-distribution generalization
ICBHI dataset
Authors

June-Woo Kim
Department of Electronic Engineering, Wonkwang University, Republic of Korea
Miika Toikkanen
RSC LAB, MODULABS, Republic of Korea
Heejoon Koo
Unknown affiliation
Artificial Intelligence
Yoon Tae Kim
RSC LAB, MODULABS, Republic of Korea
Doyoung Kwon
AICU Global Inc., Republic of Korea
Kyunghoon Kim
Seoul National University Bundang Hospital, Republic of Korea