Bidirectional Linear Recurrent Models for Sequence-Level Multisource Fusion

📅 2025-04-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the quadratic time complexity of Transformers and the inability of RNNs to model bidirectional dependencies in long-term time series forecasting, this paper proposes the Bidirectional Linear Recurrent Unit (BLUR)—a novel sequence modeling architecture achieving both linear time complexity and explicit bidirectional dependency capture. BLUR integrates forward and backward linear recurrent units to jointly model historical and future contextual information, underpinned by a theoretically grounded stable initialization scheme ensuring training convergence. We further design an end-to-end multi-source time series fusion framework based on BLUR. Extensive experiments on image sequences and real-world meteorological and energy datasets demonstrate that BLUR outperforms both Transformers and conventional RNNs in forecasting accuracy, while accelerating training by 3.2× and reducing memory consumption by 57%. To our knowledge, BLUR is the first linear-time RNN variant to simultaneously surpass Transformers in both accuracy and efficiency.

Technology Category

Application Category

📝 Abstract
Sequence modeling is a critical yet challenging task with wide-ranging applications, especially in time series forecasting for domains like weather prediction, temperature monitoring, and energy load forecasting. Transformers, with their attention mechanism, have emerged as state-of-the-art due to their efficient parallel training, but they suffer from quadratic time complexity, limiting their scalability for long sequences. In contrast, recurrent neural networks (RNNs) offer linear time complexity, spurring renewed interest in linear RNNs for more computationally efficient sequence modeling. In this work, we introduce BLUR (Bidirectional Linear Unit for Recurrent network), which uses forward and backward linear recurrent units (LRUs) to capture both past and future dependencies with high computational efficiency. BLUR maintains the linear time complexity of traditional RNNs, while enabling fast parallel training through LRUs. Furthermore, it offers provably stable training and strong approximation capabilities, making it highly effective for modeling long-term dependencies. Extensive experiments on sequential image and time series datasets reveal that BLUR not only surpasses transformers and traditional RNNs in accuracy but also significantly reduces computational costs, making it particularly suitable for real-world forecasting tasks. Our code is available here.
Problem

Research questions and friction points this paper is trying to address.

Efficiently model long sequences with linear complexity
Capture bidirectional dependencies in time series data
Reduce computational costs while maintaining high accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bidirectional Linear Recurrent Units (BLUR)
Linear time complexity with parallel training
Provably stable training for long sequences
🔎 Similar Papers
No similar papers found.
Qisai Liu
Qisai Liu
Iowa State University
Reinforcement LearningMachine Learning
Zhanhong Jiang
Zhanhong Jiang
Scientist at TrAC
Distributed optimizationMachine Learning
Joshua R. Waite
Joshua R. Waite
Postdoctoral Research Associate, Translational AI Center, Iowa State University
Machine LearningDeep Learning
C
Chao Liu
Department of Energy and Power Engineering, Tsinghua University, Haidian, 100084, Beijing, China
Aditya Balu
Aditya Balu
Iowa State University
S
Soumik Sarkar
Department of Mechanical Engineering, Iowa State University, Ames, 50011, Iowa, United States; Department of Computer Science, Iowa State University, Ames, 50011, Iowa, United States; Translational AI Center, Iowa State University, Ames, 50011, IA, United States