When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of joint optimization of poles and weights in recurrent neural networks (RNNs) under data-scarce, real-time online learning scenarios, where such co-optimization often leads to highly non-convex landscapes, training instability, and poor convergence. To circumvent the optimization complexity introduced by learning poles, the authors propose a fixed-pole RNN architecture inspired by echo state networks (ESNs). Through theoretical analysis and empirical validation—including comparisons across multiple optimizers and the use of complex-domain gradient descent—the study demonstrates that fixed poles yield more stable and well-conditioned internal state representations. Under limited data conditions, the proposed approach consistently outperforms learnable-pole models, achieving faster convergence and superior performance, thereby highlighting its practical utility for real-time online learning tasks.
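The core idea of fixing the poles can be sketched as a diagonal state recursion whose complex poles are drawn once inside the unit circle and never updated by training. The code below is a minimal illustration of that idea, not the authors' implementation; the pole radii, state count, and input weights are assumed values chosen only for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed complex poles inside the unit circle (|p| < 1 keeps the state
# recursion stable). These values are illustrative, not the paper's.
n_states = 8
radii = rng.uniform(0.5, 0.95, n_states)
angles = rng.uniform(0.0, 2.0 * np.pi, n_states)
poles = radii * np.exp(1j * angles)

def run_fixed_pole_rnn(u, w_in):
    """Diagonal state recursion x[t] = p * x[t-1] + w_in * u[t]."""
    x = np.zeros(n_states, dtype=complex)
    states = []
    for u_t in u:
        x = poles * x + w_in * u_t  # poles are never touched by training
        states.append(x.copy())
    return np.array(states)

u = rng.standard_normal(100)        # toy input sequence
w_in = rng.standard_normal(n_states)
X = run_fixed_pole_rnn(u, w_in)
print(X.shape)  # (100, 8)
```

With the poles held fixed, only `w_in` (and a readout over `X`) would be learned, which keeps the optimization over the remaining weights well-conditioned, matching the stability argument in the summary.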

📝 Abstract
Recurrent neural networks (RNNs) can be interpreted as discrete-time state-space models, where the state evolution corresponds to an infinite-impulse-response (IIR) filtering operation governed by both feedforward weights and recurrent poles. While, in principle, all parameters including pole locations can be optimized via backpropagation through time (BPTT), such joint learning incurs substantial computational overhead and is often impractical for applications with limited training data. Echo state networks (ESNs) mitigate this limitation by fixing the recurrent dynamics and training only a linear readout, enabling efficient and stable online adaptation. In this work, we analytically and empirically examine why learning recurrent poles does not provide tangible benefits in data-constrained, real-time learning scenarios. Our analysis shows that pole learning renders the weight optimization problem highly non-convex, requiring significantly more training samples and iterations for gradient-based methods to converge to meaningful solutions. Empirically, we observe that for complex-valued data, gradient descent frequently exhibits prolonged plateaus, and advanced optimizers offer limited improvement. In contrast, fixed-pole architectures induce stable and well-conditioned state representations even with limited training data. Numerical results demonstrate that fixed-pole networks achieve superior performance with lower training complexity, making them more suitable for online real-time tasks.
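The ESN-style training regime the abstract describes, fixing the recurrent dynamics and fitting only a linear readout, can be sketched as follows. This is a generic echo-state-network demo under assumed settings (a toy one-step-ahead sine prediction task, a 50-unit reservoir, spectral radius 0.9, ridge parameter 1e-6); it is not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy task: one-step-ahead prediction of a sine wave.
T = 300
u = np.sin(0.1 * np.arange(T))
y = np.roll(u, -1)  # target: the next sample

# Fixed reservoir: a random recurrent matrix rescaled so its spectral
# radius is below 1 (echo state property; 0.9 is an assumed value).
n = 50
W = rng.standard_normal((n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
w_in = rng.standard_normal(n)

# Collect reservoir states; the recurrent weights stay fixed throughout.
x = np.zeros(n)
states = np.zeros((T, n))
for t in range(T):
    x = np.tanh(W @ x + w_in * u[t])
    states[t] = x

# Train only the linear readout, in closed form via ridge regression.
lam = 1e-6
w_out = np.linalg.solve(states.T @ states + lam * np.eye(n), states.T @ y)
pred = states @ w_out
# Evaluate after a washout period, excluding the wrapped final target.
mse = np.mean((pred[50:-1] - y[50:-1]) ** 2)
print(f"readout MSE: {mse:.2e}")
```

Because the readout is the solution of a linear least-squares problem, training reduces to one matrix solve, which is why this scheme suits the data-constrained, real-time setting the abstract emphasizes.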
Problem

Research questions and friction points this paper is trying to address.

recurrent neural networks
pole learning
real-time online training
data-constrained learning
non-convex optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fixed-Pole RNN
Online Learning
Non-convex Optimization
Echo State Networks
Real-Time Training