LMU-Based Sequential Learning and Posterior Ensemble Fusion for Cross-Domain Infant Cry Classification

📅 2026-02-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three challenges in infant cry classification: short, non-stationary cry signals, scarce labeled data, and domain shift across infants and datasets. It proposes a compact acoustic framework that fuses MFCC, STFT, and fundamental-frequency (pitch) features through a multi-branch CNN encoder and replaces conventional LSTMs with an enhanced Legendre Memory Unit (LMU), which models temporal dynamics stably with substantially fewer recurrent parameters. A calibrated posterior ensemble with entropy-gated weighting is further introduced to mitigate dataset bias while preserving domain-specific expertise. Cross-dataset experiments on Baby2020 and Baby Crying, run with leakage-aware data splits, show improved macro F1-score and indicate real-time feasibility for on-device monitoring.
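To make the LMU idea concrete, here is a minimal sketch of the fixed memory recurrence from the standard LMU formulation, theta * dm/dt = A m + B u. It is not the paper's enhanced variant, whose details are not given in this summary. The function names, the forward-Euler discretization, and the default `order` and `theta` values are illustrative assumptions; published LMU implementations typically use a zero-order-hold discretization instead.

```python
def lmu_matrices(order, theta):
    """Standard LMU state-space matrices, pre-divided by the window length theta.

    a_ij = (2i+1) * (-1 if i < j else (-1)^(i-j+1)),  b_i = (2i+1) * (-1)^i
    """
    A = [[0.0] * order for _ in range(order)]
    B = [0.0] * order
    for i in range(order):
        B[i] = (2 * i + 1) * (-1.0) ** i / theta
        for j in range(order):
            sign = -1.0 if i < j else (-1.0) ** (i - j + 1)
            A[i][j] = (2 * i + 1) * sign / theta
    return A, B


def euler_step(m, u, A, B, dt=1.0):
    """One forward-Euler update of the memory state (assumption: dt << theta)."""
    new_m = []
    for i in range(len(m)):
        dm = sum(A[i][j] * m[j] for j in range(len(m))) + B[i] * u
        new_m.append(m[i] + dt * dm)
    return new_m


def run_memory(signal, order=2, theta=20.0):
    """Feed a scalar sequence through the memory and return the final state."""
    A, B = lmu_matrices(order, theta)
    m = [0.0] * order
    for u in signal:
        m = euler_step(m, u, A, B)
    return m


if __name__ == "__main__":
    # A constant input should be captured by the first (constant) Legendre coefficient.
    print(run_memory([1.0] * 500, order=2, theta=20.0))
```

Because A and B are fixed by `order` and `theta` rather than learned, the memory itself contributes no trained recurrent weights, which is one reason the LMU can be lighter than an LSTM of comparable capacity. With `order=2`, a constant input of 1.0 drives the state toward approximately `[1.0, 0.0]`.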

📝 Abstract

Decoding infant cry causes remains challenging for healthcare monitoring due to short nonstationary signals, limited annotations, and strong domain shifts across infants and datasets. We propose a compact acoustic framework that fuses MFCC, STFT, and pitch features within a multi-branch CNN encoder and models temporal dynamics using an enhanced Legendre Memory Unit (LMU). Compared to LSTMs, the LMU backbone provides stable sequence modeling with substantially fewer recurrent parameters, supporting efficient deployment. To improve cross-dataset generalization, we introduce calibrated posterior ensemble fusion with entropy-gated weighting to preserve domain-specific expertise while mitigating dataset bias. Experiments on Baby2020 and Baby Crying demonstrate improved macro-F1 under cross-domain evaluation, along with leakage-aware splits and real-time feasibility for on-device monitoring.
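The entropy-gated posterior fusion described above could look roughly like the following sketch. The abstract does not give the exact weighting rule, so everything here is an assumption: each domain-specific model is taken to output an already calibrated class posterior, its weight is one minus its normalized entropy, and a hypothetical `gate` threshold zeroes out models that are close to uniform.

```python
import math


def normalized_entropy(p):
    """Shannon entropy of a posterior, scaled to [0, 1] by log(num_classes)."""
    h = -sum(x * math.log(x) for x in p if x > 0.0)
    return h / math.log(len(p))


def entropy_weighted_fusion(posteriors, gate=0.95):
    """Fuse per-model posteriors, trusting confident (low-entropy) models more.

    posteriors: list of class-probability lists, one per model (assumed calibrated).
    gate: hypothetical threshold; models at or above it get zero weight.
    """
    weights = []
    for p in posteriors:
        h = normalized_entropy(p)
        weights.append(0.0 if h >= gate else 1.0 - h)

    total = sum(weights)
    if total == 0.0:
        # Every model was gated out: fall back to a plain average.
        weights = [1.0 / len(posteriors)] * len(posteriors)
    else:
        weights = [w / total for w in weights]

    num_classes = len(posteriors[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, posteriors))
        for c in range(num_classes)
    ]
    return fused, weights


if __name__ == "__main__":
    model_a = [0.90, 0.05, 0.05]   # confident on this clip
    model_b = [1 / 3, 1 / 3, 1 / 3]  # uninformative on this clip
    print(entropy_weighted_fusion([model_a, model_b]))
```

In this toy case the uniform posterior is gated out, so the fused output follows the confident model alone. Weighting per clip rather than per dataset is one way a fusion rule could keep each model's domain-specific expertise while limiting the influence of a model evaluated outside its domain.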

Problem

Research questions and friction points this paper is trying to address.

infant cry classification

cross-domain

domain shift

limited annotations

nonstationary signals
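Domain shift across infants also affects how results should be measured: if clips from the same infant land in both training and test sets, scores can be inflated. The summary does not say how the paper builds its leakage-aware splits, so the sketch below only illustrates one common approach, assuming each clip carries a hypothetical infant identifier and that whole infants are held out together.

```python
import random


def group_split(sample_groups, test_fraction=0.2, seed=0):
    """Split sample indices so that no group (e.g. infant ID) spans both sets.

    sample_groups: one group label per sample (hypothetical infant IDs).
    Returns (train_indices, test_indices).
    """
    groups = sorted(set(sample_groups))
    rng = random.Random(seed)
    rng.shuffle(groups)

    # Hold out whole groups, at least one.
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])

    train_idx = [i for i, g in enumerate(sample_groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(sample_groups) if g in test_groups]
    return train_idx, test_idx


if __name__ == "__main__":
    infant_ids = ["a", "a", "b", "b", "c", "c", "d", "e"]
    train_idx, test_idx = group_split(infant_ids, test_fraction=0.4)
    print(train_idx, test_idx)
```

Splitting by group rather than by clip means the test score reflects performance on unseen infants, which is closer to the cross-domain setting the paper targets.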

Innovation

Methods, ideas, or system contributions that make the work stand out.

Legendre Memory Unit

cross-domain classification

posterior ensemble fusion