mGRADE: Minimal Recurrent Gating Meets Delay Convolutions for Lightweight Sequence Modeling

📅 2025-07-02
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Addressing the challenge of memory-constrained temporal modeling on edge devices, this paper proposes mGRADEβ€”a lightweight hybrid architecture. Methodologically, mGRADE integrates learnable-stride dilated convolutions (to capture multi-scale local dynamics) with a minimalist gated recurrent unit (minGRU) for long-range dependency modeling, forming a parallelizable, memory-efficient hybrid memory system. Crucially, it achieves constant-memory complexity during both training and inference and supports fully parallel training. Empirically, mGRADE outperforms pure convolutional and pure RNN baselines on synthetic sequence tasks and pixel-level image classification benchmarks, achieving comparable or superior accuracy with approximately 20% lower memory footprint. By enabling efficient co-modeling of short- and long-range temporal dynamics, mGRADE establishes a cost-effective, edge-deployable paradigm for sequence modeling.
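The minGRU recurrence the summary refers to can be sketched in a few lines. This is an illustrative NumPy version, not the paper's code: the gate z_t and candidate h̃_t depend only on the current input, so h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t, which keeps state memory constant and (because the gates never read h_{t−1}) permits parallel scan-based training; it is written sequentially here for clarity. The weight names Wz and Wh are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def min_gru(x, Wz, Wh):
    """Sequential minGRU sketch.
    z_t      = sigmoid(x_t @ Wz)       # input-only gate
    h_tilde  = x_t @ Wh                # input-only candidate state
    h_t      = (1 - z_t) * h_{t-1} + z_t * h_tilde
    x: (T, C) sequence; returns (T, H) hidden states."""
    T = x.shape[0]
    H = Wz.shape[1]
    h = np.zeros(H)
    out = np.empty((T, H))
    for t in range(T):
        z = sigmoid(x[t] @ Wz)
        h_tilde = x[t] @ Wh
        h = (1.0 - z) * h + z * h_tilde  # convex blend of old state and candidate
        out[t] = h
    return out
```

Because neither z_t nor h̃_t depends on the previous hidden state, the whole sequence of updates is a linear recurrence in h and can be evaluated with a parallel prefix scan at training time.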

πŸ“ Abstract
Edge devices for temporal processing demand models that capture both short- and long-range dynamics under tight memory constraints. While Transformers excel at sequence modeling, their quadratic memory scaling with sequence length makes them impractical for such settings. Recurrent Neural Networks (RNNs) offer constant memory but train sequentially, and Temporal Convolutional Networks (TCNs), though efficient, scale memory with kernel size. To address this, we propose mGRADE (minimally Gated Recurrent Architecture with Delay Embedding), a hybrid-memory system that integrates a temporal 1D-convolution with learnable spacings followed by a minimal gated recurrent unit (minGRU). This design allows the convolutional layer to realize a flexible delay embedding that captures rapid temporal variations, while the recurrent module efficiently maintains global context with minimal memory overhead. We validate our approach on two synthetic tasks, demonstrating that mGRADE effectively separates and preserves multi-scale temporal features. Furthermore, on challenging pixel-by-pixel image classification benchmarks, mGRADE consistently outperforms both pure convolutional and pure recurrent counterparts using approximately 20% less memory footprint, highlighting its promise as an efficient solution for memory-constrained multi-scale temporal processing at the edge.
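The delay embedding realized by the convolutional layer can be illustrated as a tapped delay line: for each spacing d (learnable in the paper, fixed here for the sketch), the input d steps in the past is appended to the current features, which is exactly the set of taps a dilated 1D convolution with learnable spacings reads. This is a minimal sketch under those assumptions, not the paper's implementation.

```python
import numpy as np

def delay_embedding(x, delays):
    """Tapped-delay-line view of a 1D conv with per-tap spacings.
    For each delay d, append x shifted d steps into the past
    (zero-padded at the start, i.e. causal).
    x: (T, C) sequence; returns (T, C * len(delays))."""
    T, C = x.shape
    taps = []
    for d in delays:
        shifted = np.zeros_like(x)
        if d < T:
            shifted[d:] = x[:T - d]  # value seen d steps ago
        taps.append(shifted)
    return np.concatenate(taps, axis=-1)
```

Feeding this embedding into the recurrent module gives the recurrence simultaneous access to several time scales of the input: widely spaced taps cover slow dynamics while closely spaced taps resolve rapid variations.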
Problem

Research questions and friction points this paper is trying to address.

Balancing memory and performance in edge device sequence modeling
Combining RNNs and TCNs for efficient temporal feature capture
Reducing memory usage while maintaining multi-scale temporal processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid-memory system with temporal 1D-convolution
Minimal gated recurrent unit for global context
Efficient multi-scale feature separation and preservation