On the Stability of the Jacobian Matrix in Deep Neural Networks

📅 2025-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Increasing network depth often causes gradient explosion or vanishing, a phenomenon rooted in spectral instability of the input-output Jacobian. Existing stability theories are limited to fully connected architectures with i.i.d. weight assumptions, and fail to capture the sparsity induced by pruning or the weak dependencies among weights that arise during training. Method: We develop a general Jacobian spectral stability theorem that applies to sparse connectivity and non-i.i.d. (weakly dependent) weights, combining random matrix theory, dependency modeling, and structured spectral analysis into a more realistic initialization framework. Contribution/Results: Our theory provides rigorous, verifiable spectral stability guarantees for both pruned models and post-training networks, addressing critical gaps left by classical initialization theory, and significantly extends the applicability of deep-network initialization theory, enabling principled design and analysis of modern sparse and trained architectures.

📝 Abstract
Deep neural networks are known to suffer from exploding or vanishing gradients as depth increases, a phenomenon closely tied to the spectral behavior of the input-output Jacobian. Prior work has identified critical initialization schemes that ensure Jacobian stability, but these analyses are typically restricted to fully connected networks with i.i.d. weights. In this work, we go significantly beyond these limitations: we establish a general stability theorem for deep neural networks that accommodates sparsity (such as that introduced by pruning) and non-i.i.d., weakly correlated weights (e.g. induced by training). Our results rely on recent advances in random matrix theory, and provide rigorous guarantees for spectral stability in a much broader class of network models. This extends the theoretical foundation for initialization schemes in modern neural networks with structured and dependent randomness.
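The link between depth and gradient explosion/vanishing described in the abstract can be seen numerically. The following sketch (my own illustration, not code from the paper) computes the singular values of the input-output Jacobian of a deep tanh network at a random input, under the classical i.i.d. Gaussian initialization with per-layer variance `sigma**2 / n`; away from the critical scale, the spectrum collapses or blows up exponentially in depth.

```python
import numpy as np

# Illustration of the phenomenon the abstract describes (not the paper's
# method): singular values of the input-output Jacobian of a deep tanh
# network under i.i.d. Gaussian initialization W_ij ~ N(0, sigma^2 / n).
rng = np.random.default_rng(0)
n, depth = 200, 30  # width and depth, chosen for this toy example

def jacobian_spectrum(sigma):
    """Singular values of J = D_L W_L ... D_1 W_1 at a random input."""
    x = rng.standard_normal(n)
    J = np.eye(n)
    for _ in range(depth):
        W = rng.standard_normal((n, n)) * sigma / np.sqrt(n)
        h = W @ x
        D = np.diag(1.0 - np.tanh(h) ** 2)  # tanh'(h) on the diagonal
        J = D @ W @ J                        # chain rule across the layer
        x = np.tanh(h)
    return np.linalg.svd(J, compute_uv=False)

for sigma in (0.5, 1.0, 2.0):
    s = jacobian_spectrum(sigma)
    print(f"sigma={sigma}: top singular value {s.max():.3e}")
```

With `sigma` well below the critical scale the spectrum vanishes with depth; well above it, the top singular value grows exponentially. This is the instability that the paper's stability theorem controls, in the harder setting of sparse and weakly dependent weights.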
Problem

Research questions and friction points this paper is trying to address.

Analyzing Jacobian stability in deep neural networks
Extending stability theory to sparse and non-i.i.d. weights
Providing spectral guarantees for modern network architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

General stability theorem for deep neural networks
Accommodates sparsity and non-i.i.d. weights
Uses random matrix theory for spectral stability
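To make the sparsity point above concrete, here is a small numerical sketch (my own assumption about how sparsity enters a variance-based stability condition, not the paper's theorem): pruning a critically initialized layer with a Bernoulli(p) mask shrinks the mean-square singular value by a factor of p, and rescaling the surviving weights by `1/sqrt(p)` restores the dense value.

```python
import numpy as np

# Hedged illustration: effect of Bernoulli(p) pruning on a layer's
# spectrum, and a variance-correcting rescale. The 1/sqrt(p) factor is a
# standard heuristic, assumed here for illustration only.
rng = np.random.default_rng(1)
n, p = 400, 0.3

W = rng.standard_normal((n, n)) / np.sqrt(n)  # dense critical init
mask = rng.random((n, n)) < p                 # keep each weight w.p. p
W_sparse = W * mask                           # naive pruning
W_rescaled = W_sparse / np.sqrt(p)            # variance-corrected pruning

def mean_sq_sv(M):
    """Mean squared singular value, i.e. ||M||_F^2 / n."""
    return np.mean(np.linalg.svd(M, compute_uv=False) ** 2)

print(mean_sq_sv(W), mean_sq_sv(W_sparse), mean_sq_sv(W_rescaled))
```

The dense and rescaled matrices both have mean-square singular value near 1, while the naively pruned one sits near p. Characterizing when such guarantees survive weak dependence among entries, beyond this i.i.d. toy case, is what the paper's random-matrix analysis addresses.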