🤖 AI Summary
Current auscultation-based diagnosis suffers from substantial inter-observer variability and poor generalizability of AI models in resource-constrained settings, hindering early disease detection. To address these challenges, we introduce the first integrated foundational framework for cardiorespiratory and bowel sound analysis—comprising (1) AuscultaBase-Corpus, a large-scale, multi-source acoustic corpus (322+ hours); (2) AuscultaBase-Model, a contrastive learning–driven universal body-sound foundation model; and (3) AuscultaBase-Bench, a unified evaluation benchmark covering 16 diagnostic subtasks. Leveraging multi-source data fusion, self-supervised contrastive pretraining, and cross-domain transfer evaluation, our framework achieves statistically significant improvements over existing open-source acoustic pretrained models on 12 of 16 tasks. It markedly enhances cardiac, respiratory, and bowel sound classification, anomaly detection, and lesion localization—advancing objective, standardized phonocardiographic and phonopneumographic analysis.
📝 Abstract
Auscultation of internal body sounds is essential for diagnosing a range of health conditions, yet its effectiveness is often limited by clinicians' expertise and the acoustic constraints of human hearing, restricting its use across various clinical scenarios. To address these challenges, we introduce AuscultaBase, a foundational framework aimed at advancing body sound diagnostics through innovative data integration and contrastive learning techniques. Our contributions include the following: First, we compile AuscultaBase-Corpus, a large-scale, multi-source body sound database encompassing 11 datasets with 40,317 audio recordings and totaling 322.4 hours of heart, lung, and bowel sounds. Second, we develop AuscultaBase-Model, a foundational diagnostic model for body sounds, utilizing contrastive learning on the compiled corpus. Third, we establish AuscultaBase-Bench, a comprehensive benchmark containing 16 sub-tasks, assessing the performance of various open-source acoustic pre-trained models. Evaluation results indicate that our model outperforms all other open-source models in 12 out of 16 tasks, demonstrating the efficacy of our approach in advancing diagnostic capabilities for body sound analysis.