Large Language Models and Non-Negative Matrix Factorization for Bioacoustic Signal Decomposition

📅 2025-07-12
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Overlapping body sounds (e.g., heart and respiratory sounds) in clinical auscultation are hard to separate and interpret, which limits the explainability of automated diagnostic systems. To address this, the authors propose an unsupervised two-stage framework. First, non-negative matrix factorization (NMF) decomposes the time-frequency representation of mixed audio acquired with a digital stethoscope. Second, and this is the novel step, a large language model (LLM) maps the separated components to clinically interpretable physiological or pathological labels (e.g., atrial fibrillation, bronchial breath sounds), without labeled data or prior knowledge of the source types. Experiments under simulated clinical conditions (recordings from a clinical manikin) are reported to show robust separation of overlapping sounds, identification of characteristic pathological acoustics, and closer alignment between decomposition results and clinical diagnostic reasoning. The work is positioned as a step toward explainable AI-assisted auscultation.

๐Ÿ“ Abstract
Large language models have shown a remarkable ability to extract meaning from unstructured data, offering new ways to interpret biomedical signals beyond traditional numerical methods. In this study, we present a matrix factorization framework for bioacoustic signal analysis which is enhanced by large language models. The focus is on separating bioacoustic signals that commonly overlap in clinical recordings, using matrix factorization to decompose the mixture into interpretable components. A large language model is then applied to the separated signals to associate distinct acoustic patterns with potential medical conditions such as cardiac rhythm disturbances or respiratory abnormalities. Recordings were obtained from a digital stethoscope applied to a clinical manikin to ensure a controlled and high-fidelity acquisition environment. This hybrid approach does not require labeled data or prior knowledge of source types, and it provides a more interpretable and accessible framework for clinical decision support. The method demonstrates promise for integration into future intelligent diagnostic tools.
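The decomposition step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which NMF variant, cost function, or spectrogram settings the authors used, so everything below is an assumption chosen for illustration: a Hann-windowed magnitude spectrogram, the classic multiplicative-update rule for the Frobenius cost, a rank of 2, and a synthetic two-source mixture standing in for a stethoscope recording.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Magnitude of a short-time Fourier transform, shape (freq_bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate V by W @ H with non-negative W and H.

    Uses multiplicative updates for the Frobenius cost (an assumed variant,
    not necessarily the one used in the paper).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps  # spectral basis vectors (columns)
    H = rng.random((rank, V.shape[1])) + eps  # activations over time (rows)
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a recording: short low-frequency bursts plus a
# slowly modulated higher-frequency tone. These are NOT real body sounds.
fs = 2000  # sampling rate in Hz (assumed)
t = np.arange(0, 4.0, 1 / fs)
low = np.sin(2 * np.pi * 60 * t) * (np.sin(2 * np.pi * 1.2 * t) > 0.9)
high = 0.5 * np.sin(2 * np.pi * 400 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 0.25 * t))
mixture = low + high

V = magnitude_spectrogram(mixture)
W, H = nmf(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.3f}")
```

Each column of `W` is a spectral pattern and the matching row of `H` says when that pattern is active. On a real recording the rank and the number of iterations would need tuning, and nothing guarantees that each component corresponds to exactly one physiological source.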
Problem

Research questions and friction points this paper is trying to address.

Decompose overlapping bioacoustic signals in clinical recordings
Associate acoustic patterns with medical conditions using LLMs
Enable interpretable clinical decision support without labeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines large language models with matrix factorization
Decomposes bioacoustic signals without labeled data
Links acoustic patterns to medical conditions automatically
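How the separated components are presented to the language model is not specified in this summary. One plausible route, shown here purely as an assumption, is to summarise each NMF component with a few numeric descriptors and assemble them into a text prompt. The helper names `describe_component` and `build_prompt`, the chosen descriptors, and the prompt wording are all hypothetical, and no model is actually called.

```python
import numpy as np

def describe_component(basis, activation, freqs):
    """Summarise one NMF component as plain text (hypothetical descriptors)."""
    peak_hz = float(freqs[np.argmax(basis)])
    centroid_hz = float(np.sum(freqs * basis) / np.sum(basis))
    # Share of frames in which the activation exceeds half its maximum.
    active = float(np.mean(activation > 0.5 * activation.max()))
    return (
        f"peak frequency {peak_hz:.0f} Hz, "
        f"spectral centroid {centroid_hz:.0f} Hz, "
        f"active in {active:.0%} of frames"
    )

def build_prompt(W, H, freqs):
    """Assemble a prompt from basis matrix W and activation matrix H."""
    lines = [
        f"Component {k + 1}: " + describe_component(W[:, k], H[k], freqs)
        for k in range(W.shape[1])
    ]
    return (
        "The following components were separated from a stethoscope recording.\n"
        + "\n".join(lines)
        + "\nFor each component, suggest which physiological source it most "
        "plausibly reflects and explain why."
    )

# Toy factors standing in for real NMF output (illustrative values only).
freqs = np.linspace(0, 1000, 11)          # 0, 100, ..., 1000 Hz
W = np.zeros((11, 2))
W[1, 0] = 1.0                             # component 1 concentrated at 100 Hz
W[4, 1] = 1.0                             # component 2 concentrated at 400 Hz
H = np.array([[1.0, 0.0, 0.0, 1.0],       # intermittent activation
              [1.0, 1.0, 1.0, 1.0]])      # continuous activation

prompt = build_prompt(W, H, freqs)
print(prompt)
```

The resulting string could be sent to any chat-style model. Any label the model returns would be a suggestion to review, not a diagnosis.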
Yasaman Torabi
PhD Electrical & Computer Engineering
Shahram Shirani
McMaster University
Image and video processing
James P. Reilly
Electrical and Computer Engineering Department, McMaster University, Hamilton, Canada