An Analysis Framework for Understanding Deep Neural Networks Based on Network Dynamics

📅 2025-01-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the fundamental challenges of interpretability and generalization in deep neural networks (DNNs) by proposing a unified analytical framework grounded in neural dynamics. Methodologically, it classifies neurons by whether their transformation functions are order-preserving or non-order-preserving, and combines this classification with a quantitative characterization of attraction basins in the sample and weight spaces, drawing on order theory, information theory, and dynamical systems modeling. The framework provides the first unified explanation of diverse phenomena including “flat minima,” “grokking,” and the “double descent” curve. Theoretically, it reveals cross-layer information-optimization mechanisms in DNNs; practically, it informs principled selection of network depth and width. Empirical validation spans networks of up to 100 layers, establishing a rigorous foundation for understanding DNN generalization and guiding architecture design.
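The summary's notion of attraction basins around training samples can be pictured as a perturbation sweep: how far can an input be moved before the model's decision changes? The sketch below is illustrative only, not the paper's actual measurement; `basin_radius`, the toy `predict` classifier, and all parameters are hypothetical stand-ins for a trained DNN.

```python
import numpy as np

rng = np.random.default_rng(1)

def basin_radius(predict, x, n_dirs=64, r_max=2.0, steps=20):
    """Estimate an input-space attraction-basin radius for sample x:
    the largest tested perturbation norm (over random directions) at
    which the predicted label still matches the unperturbed one."""
    base = predict(x)
    step = r_max / steps
    radii = np.linspace(step, r_max, steps)
    dirs = rng.normal(size=(n_dirs, x.size))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit directions
    worst = r_max
    for d in dirs:
        for r in radii:
            if predict(x + r * d) != base:
                worst = min(worst, r - step)  # last radius that kept the label
                break
    return worst

# Toy classifier standing in for a trained DNN: label = sign of first coordinate.
predict = lambda x: int(x[0] > 0)
print(basin_radius(predict, np.array([0.5, 0.0])))
```

A sample sitting far from the decision boundary yields a larger radius, matching the intuition that wide basins correspond to robust, well-generalized predictions.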

📝 Abstract
Advancing artificial intelligence demands a deeper understanding of the mechanisms underlying deep learning. Here, we propose a straightforward analysis framework based on the dynamics of learning models. Neurons are categorized into two modes based on whether their transformation functions preserve order. This categorization reveals how deep neural networks (DNNs) maximize information extraction by rationally allocating the proportion of neurons in different modes across deep layers. We further introduce the attraction basins of the training samples in both the sample vector space and the weight vector space to characterize the generalization ability of DNNs. This framework allows us to identify optimal depth and width configurations, providing a unified explanation for fundamental DNN behaviors such as the "flat minima effect," "grokking," and double descent phenomena. Our analysis extends to networks with depths up to 100 layers.
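The abstract's split of neurons into order-preserving and non-order-preserving modes can be illustrated with a small empirical check. This is a sketch under an assumed definition (a scalar transformation is order-preserving if it is monotone non-decreasing); the function name and sampling scheme are hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def is_order_preserving(f, n_pairs=1000, lo=-5.0, hi=5.0):
    """Empirically check x1 <= x2  =>  f(x1) <= f(x2) on sampled pairs
    (assumed definition of an order-preserving neuron transformation)."""
    x1 = rng.uniform(lo, hi, n_pairs)
    x2 = rng.uniform(lo, hi, n_pairs)
    a, b = np.minimum(x1, x2), np.maximum(x1, x2)
    return bool(np.all(f(a) <= f(b)))

relu = lambda x: np.maximum(x, 0.0)   # monotone: preserves order
bump = lambda x: np.exp(-x**2)        # non-monotone: does not

print(is_order_preserving(relu))  # True
print(is_order_preserving(bump))  # False
```

Under this reading, common monotone activations (ReLU, sigmoid, tanh) would place a neuron in the order-preserving mode, while non-monotone transformations break input ordering and fall in the other mode.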
Problem

Research questions and friction points this paper is trying to address.

Deep Neural Networks
Optimal Configuration
Phenomena Explanation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Neural Networks
Neuron Classification
Performance Optimization
Yuchen Lin
Peking University
Computer Vision
Yong Zhang
Department of Physics, Xiamen University, Xiamen 361005, China
Sihan Feng
Department of Physics, Xiamen University, Xiamen 361005, China
Hong Zhao
Department of Physics, Xiamen University, Xiamen 361005, China