🤖 AI Summary
To address the poor generalizability of phonocardiogram (PCG) segmentation models when labeled data are scarce, this paper proposes TopSeg, a multiscale topological representation framework. TopSeg extracts persistent homology features (H₀ and H₁) at multiple scales to encode the structured dynamics of PCG signals, which serves as a strong inductive bias. It pairs these features with a lightweight temporal convolutional network (TCN) and an order- and duration-constrained decoding step for robust segmentation. Trained on PhysioNet 2016 and externally validated on CirCor, TopSeg consistently outperforms spectrogram- and envelope-based baselines under very low labeling budgets (e.g., <10% annotated data), while remaining competitive in the full-data setting. To our knowledge, this is the first work to systematically introduce topological data analysis into few-shot PCG segmentation, offering both theoretical interpretability and practical engineering viability.
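To make the H₀ features mentioned above concrete, here is a minimal sketch of 0-dimensional persistent homology for a 1D sequence such as a PCG envelope. It assumes a sublevel-set filtration, which the summary does not specify; TopSeg's actual filtration, scales, and H₁ construction may differ, and the function name `sublevel_h0_persistence` is hypothetical. Each local minimum gives birth to a connected component, and when two components meet, the younger one dies (the elder rule).

```python
def sublevel_h0_persistence(signal):
    """Return sorted (birth, death) pairs of H0 classes for the
    sublevel-set filtration of a 1D sequence (illustrative assumption,
    not necessarily the filtration used by TopSeg)."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    parent = [-1] * n   # -1 marks samples not yet in the filtration
    birth = {}          # component root -> birth value
    pairs = []

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at this level.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < signal[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old

    # The component born at the global minimum never dies.
    pairs.append((min(signal), float("inf")))
    return sorted(pairs)
```

Long-lived pairs correspond to prominent valleys in the sequence, while short-lived pairs reflect small fluctuations. H₁ is trivial for a 1D sublevel-set filtration, so loop features would need a different construction (for example, a point cloud built from the signal), which this sketch omits.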
📝 Abstract
Deep learning approaches for heart-sound (PCG) segmentation built on time-frequency features can be accurate but often rely on large expert-labeled datasets, limiting robustness and deployment. We present TopSeg, a topological representation-centric framework that encodes PCG dynamics with multi-scale topological features and decodes them using a lightweight temporal convolutional network (TCN) with an order- and duration-constrained inference step. To evaluate data efficiency and generalization, we train exclusively on the PhysioNet 2016 dataset with subject-level subsampling and perform external validation on the CirCor dataset. Under matched-capacity decoders, the topological features consistently outperform spectrogram and envelope inputs, with the largest margins at low data budgets; as a full system, TopSeg surpasses representative end-to-end baselines trained on their native inputs under the same budgets while remaining competitive at full data. Ablations at 10% training data confirm that all scales contribute and that combining H₀ and H₁ yields more reliable S1/S2 localization and boundary stability. These results indicate that topology-aware representations provide a strong inductive bias for data-efficient, cross-dataset PCG segmentation, supporting practical use when labeled data are limited.
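The order- and duration-constrained inference step can be illustrated with a small Viterbi-style decoder. This is a sketch under stated assumptions, not the authors' implementation: it assumes four states in the cyclic order S1 → systole → S2 → diastole (indices 0 to 3), per-frame log-probabilities from some upstream classifier, and only a minimum duration per state. The function name `constrained_decode` and the `min_dur` values are hypothetical.

```python
def constrained_decode(log_probs, min_dur):
    """Decode a frame-wise state sequence that follows the cyclic order
    0 -> 1 -> ... -> S-1 -> 0 and keeps each state for at least
    min_dur[s] frames (segments cut off by the record edges are exempt).

    log_probs: list of per-frame lists of log-probabilities, one per state.
    min_dur:   list of minimum durations in frames (illustrative values).
    """
    n_states = len(min_dur)
    # Expand each state into (state, frames spent so far, capped at min_dur).
    states = [(s, d) for s in range(n_states)
              for d in range(1, min_dur[s] + 1)]
    index = {sd: k for k, sd in enumerate(states)}

    # Allowed predecessors of every expanded state.
    preds = [[] for _ in states]
    for (s, d), k in index.items():
        # Staying in the same state advances the duration counter.
        preds[index[(s, min(d + 1, min_dur[s]))]].append(k)
        # Leaving is allowed only once the minimum duration is reached.
        if d == min_dur[s]:
            preds[index[((s + 1) % n_states, 1)]].append(k)

    # The recording may start partway through any state.
    score = [log_probs[0][s] for s, _ in states]
    back = []
    for t in range(1, len(log_probs)):
        new_score, pointers = [], []
        for k, (s, _) in enumerate(states):
            best = max(preds[k], key=lambda p: score[p])
            new_score.append(score[best] + log_probs[t][s])
            pointers.append(best)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final expanded state.
    k = max(range(len(states)), key=lambda i: score[i])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [states[k][0] for k in path]
```

Compared with taking the per-frame argmax, this kind of decoding suppresses isolated label flips and out-of-order transitions, which is one plausible reason a constrained inference step helps boundary stability. A maximum duration could be added by extending the duration counter, at the cost of a larger state space.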