🤖 AI Summary
To address data scarcity in biomedical and bioacoustic signal analysis, the parameter redundancy and training inefficiency of conventional gated RNNs, and the limited modeling capacity of linear RNNs, this paper proposes the Parallel Delayed Memory Unit (PDMU). The PDMU integrates gated delay lines, the compression mechanism of the Legendre Memory Unit (LMU), and causal attention to enable efficient short-term temporal modeling. It further introduces gated skip connections that preserve early representations and support long-term memory retention in low-information regimes, and it extends to bidirectional, computationally efficient, and spiking variants. The architecture is modular, scalable, and computationally lightweight. Evaluated across diverse audio and biomedical benchmarks, the PDMU achieves significant improvements in memory capacity and predictive performance, particularly under few-shot conditions, demonstrating both effectiveness and state-of-the-art capability.
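The core mechanism can be pictured as a FIFO buffer of recent states read through a learned gate. Below is a minimal, illustrative sketch of such a gated delay line in PyTorch; the class name, tensor shapes, and softmax gating are assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GatedDelayLine(nn.Module):
    """Hypothetical gated delay-line cell (illustrative, not the paper's code).

    A FIFO buffer holds the `delay` most recent inputs; at each step a
    learned, input-conditioned gate mixes the buffered taps, so the output
    depends on a short window of past states rather than only step t-1.
    """
    def __init__(self, dim: int, delay: int):
        super().__init__()
        self.delay = delay
        self.gate = nn.Linear(dim, delay)  # one gate logit per delay tap

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim) -> (batch, time, dim)
        B, T, D = x.shape
        buf = x.new_zeros(B, self.delay, D)               # FIFO of past inputs
        outs = []
        for t in range(T):
            w = torch.softmax(self.gate(x[:, t]), dim=-1)    # (B, delay)
            outs.append(torch.einsum("bk,bkd->bd", w, buf))  # gated causal read
            buf = torch.roll(buf, shifts=1, dims=1)          # advance the FIFO
            buf[:, 0] = x[:, t]                              # enqueue current input
        return torch.stack(outs, dim=1)

# Usage: y = GatedDelayLine(dim=32, delay=4)(torch.randn(2, 100, 32))
```

Because the buffer is read before the current input is enqueued, each output is a gated combination of strictly past states, which is the sense in which the gating acts as causal attention.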
📝 Abstract
Advanced deep learning architectures, particularly recurrent neural networks (RNNs), have been widely applied in audio, bioacoustic, and biomedical signal analysis, especially in data-scarce environments. Although gated RNNs remain effective, they can be over-parameterised and training-inefficient in some regimes, while linear RNNs tend to fall short in capturing the complexity inherent in bio-signals. To address these challenges, we propose the Parallel Delayed Memory Unit (PDMU), a delay-gated state-space module for short-term temporal credit assignment targeting audio and bioacoustic signals, which enhances short-term temporal state interactions and memory efficiency via a gated delay-line mechanism. Unlike previous Delayed Memory Units (DMU), which embed temporal dynamics into the delay-line architecture, the PDMU further compresses temporal information into vector representations using Legendre Memory Units (LMU). This design acts as a form of causal attention, allowing the model to dynamically adjust its reliance on past states and improve real-time learning performance. Notably, in low-information scenarios, the gating mechanism behaves like a skip connection, bypassing state decay and preserving early representations, thereby facilitating long-term memory retention. The PDMU is modular, supports parallel training and sequential inference, and can be easily integrated into existing linear RNN frameworks. We further introduce bidirectional, efficient, and spiking variants of the architecture, each offering additional gains in performance or energy efficiency. Experimental results on diverse audio and biomedical benchmarks demonstrate that the PDMU significantly enhances both memory capacity and overall model performance.
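For reference, the LMU compression the abstract builds on follows the state-space formulation of Voelker et al. (2019): a memory vector evolves under fixed matrices derived from Legendre polynomials and approximates the input over a sliding window of roughly θ steps. The sketch below uses those standard matrices with a simple forward-Euler discretization; the function names and the discretization choice are illustrative, not taken from this paper.

```python
import numpy as np

def lmu_matrices(order: int):
    """Continuous-time LMU state matrices A, B (Voelker et al., 2019)."""
    q = np.arange(order)
    col, row = q[:, None], q[None, :]
    # a_ij = (2i+1) * (-1 if i < j else (-1)^(i-j+1)),  b_i = (2i+1)(-1)^i
    A = np.where(col < row, -1.0, (-1.0) ** (col - row + 1)) * (2 * col + 1)
    B = (2.0 * q + 1.0) * (-1.0) ** q
    return A, B

def lmu_compress(u, order: int = 8, theta: float = 64.0):
    """Compress a scalar sequence u into a length-`order` memory vector.

    m_t approximates the last ~theta samples of u in a Legendre basis;
    forward-Euler discretization of  theta * dm/dt = A m + B u.
    """
    A, B = lmu_matrices(order)
    Ad = np.eye(order) + A / theta   # discrete transition (Euler, dt = 1)
    Bd = B / theta
    m = np.zeros(order)
    for u_t in u:
        m = Ad @ m + Bd * u_t
    return m

# Example: compress 256 samples of a sine wave into 8 coefficients.
m = lmu_compress(np.sin(np.linspace(0.0, 4.0 * np.pi, 256)))
```

Because the recurrence is linear and time-invariant, the same update can be unrolled as a convolution or parallel scan during training and run step by step at inference, which is what makes the parallel-training / sequential-inference split mentioned above possible.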