On the Role of Depth in the Expressivity of RNNs

📅 2026-04-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study investigates how depth shapes the expressive power of recurrent neural networks (RNNs), focusing on its relationship to memory capacity and the complexity of input transformations. Through theoretical analysis and empirical validation, the work gives the first formal proof that increasing network depth enhances the memory capacity of RNNs efficiently with respect to the number of parameters. The analysis is extended to second-order RNNs (2RNNs), showing that multiplicative interactions between hidden states and inputs let deep 2RNNs compute polynomial transformations whose maximal degree grows with depth, a capability that cannot in general be replicated by stacking layers with standard nonlinear activations alone. Experiments on synthetic and real-world tasks confirm the advantage of deep RNNs and 2RNNs in modeling high-order temporal dependencies.
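To make the contrast concrete, here is a minimal sketch of the two recurrences being compared, assuming the standard parameterizations: a first-order RNN cell, and a second-order cell whose update is a bilinear map given by a third-order tensor. The function names and dimensions are illustrative, not taken from the paper.

```python
import numpy as np

def rnn_cell(h, x, W, U, b):
    # First-order RNN update: linear maps of h and x are summed, then squashed.
    return np.tanh(W @ h + U @ x + b)

def second_order_cell(h, x, A):
    # Second-order (2RNN) update: a bilinear map given by a 3rd-order tensor A,
    #   h_t[i] = sum_{j,k} A[i, j, k] * h_{t-1}[j] * x_t[k]
    # Even with no activation, each step multiplies the state by the input,
    # so the map from the input sequence to h_t is not linear.
    return np.einsum('ijk,j,k->i', A, h, x)

# Tiny rollout of the second-order cell (hypothetical dimensions).
rng = np.random.default_rng(0)
d = 4
A = rng.normal(size=(d, d, d))
h = rng.normal(size=d)
for x in rng.normal(size=(5, d)):
    h = second_order_cell(h, x, A)
```

The structural difference is that the 2RNN multiplies the hidden state and the input together rather than summing their linear images, which is where the polynomial behavior discussed below comes from.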
📝 Abstract
The benefits of depth in feedforward neural networks are well known: composing multiple layers of linear transformations with nonlinear activations enables complex computations. While similar effects are expected in recurrent neural networks (RNNs), it remains unclear how depth interacts with recurrence to shape expressive power. Here, we formally show that depth increases RNNs' memory capacity efficiently with respect to the number of parameters, thus enhancing expressivity both by enabling more complex input transformations and improving the retention of past information. We broaden our analysis to 2RNNs, a generalization of RNNs with multiplicative interactions between inputs and hidden states. Unlike RNNs, which remain linear without nonlinear activations, 2RNNs perform polynomial transformations whose maximal degree grows with depth. We further show that multiplicative interactions cannot, in general, be replaced by layerwise nonlinearities. Finally, we validate these insights empirically on synthetic and real-world tasks.
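As a sanity check on these claims (my own sketch with randomly initialized activation-free layers, not an experiment from the paper), a homogeneity test distinguishes the two models: a map that is homogeneous of degree k satisfies f(c·x) = c^k f(x). The snippet below shows an activation-free first-order RNN staying linear in its inputs, a single 2RNN layer reaching degree T after T steps, and a second stacked 2RNN layer compounding the degree to T(T+1)/2.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 3
h0 = rng.normal(size=d)
xs = rng.normal(size=(T, d))
c = 2.0  # input scaling used for the homogeneity test

# Activation-free first-order RNN: h_t = W h_{t-1} + U x_t.
W, U = rng.normal(size=(d, d)), rng.normal(size=(d, d))
def run_rnn(xs, h):
    for x in xs:
        h = W @ h + U @ x
    return h

# The input-dependent part of the final state is degree 1, however long
# the sequence: without activations, a first-order RNN stays linear.
lin = run_rnn(xs, h0) - run_rnn(0 * xs, h0)
lin_c = run_rnn(c * xs, h0) - run_rnn(0 * xs, h0)
print(np.allclose(lin_c, c * lin))  # True

# Activation-free 2RNN layer: bilinear update, all states returned.
def run_2rnn(A, xs, h):
    hs = []
    for x in xs:
        h = np.einsum('ijk,j,k->i', A, h, x)
        hs.append(h)
    return np.stack(hs)

A1, A2 = rng.normal(size=(d, d, d)), rng.normal(size=(d, d, d))

# One 2RNN layer: each step multiplies in one more input, so the final
# state is homogeneous of degree T in the inputs (f(c x) = c**T f(x)).
h1, h1c = run_2rnn(A1, xs, h0), run_2rnn(A1, c * xs, h0)
print(np.allclose(h1c[-1], c**T * h1[-1]))  # True: degree 3

# Stacking a second layer on layer 1's states (degrees 1..T) compounds
# the degree to 1 + 2 + ... + T = T(T+1)/2.
h2, h2c = run_2rnn(A2, h1, h0), run_2rnn(A2, h1c, h0)
print(np.allclose(h2c[-1], c**(T * (T + 1) // 2) * h2[-1]))  # True: degree 6
```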
Problem

Research questions and friction points this paper is trying to address.

depth
expressivity
recurrent neural networks
memory capacity
multiplicative interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

depth
expressivity
RNNs
2RNNs
multiplicative interactions
Maude Lizaire
Mila & DIRO, Université de Montréal
Michael Rizvi-Martel
Mila & DIRO, Université de Montréal
Éric Dupuis
Independent, Mila & DIRO, CIFAR AI Chair, Université de Montréal
Guillaume Rabusseau
Assistant Professor - Canada CIFAR AI Chair, Université de Montréal / Mila
Machine Learning · Tensors · Weighted Automata · Tensor Networks