🤖 AI Summary
This work addresses the long-standing lack of a systematic theoretical foundation for deep neural networks (DNNs), which has hindered their interpretability and principled development. Adopting an ordinary differential equation (ODE) perspective, the study draws on dynamical systems theory and numerical analysis to formulate a continuous-time modeling framework for DNNs. It establishes a rigorous correspondence between differential equations and network architectures at both the global structural level and the level of individual layers. This unified theoretical framework provides a coherent explanation of DNN design principles, performance characteristics, and optimization mechanisms; consequently, it substantially enhances model interpretability and opens new avenues for algorithmic improvement and cross-domain application.
📝 Abstract
Deep neural networks (DNNs) have achieved remarkable empirical success, yet the absence of a principled theoretical foundation continues to hinder their systematic development. In this survey, we present differential equations as a theoretical foundation for understanding, analyzing, and improving DNNs. We organize the discussion around three guiding questions: i) how differential equations offer a principled understanding of DNN architectures, ii) how tools from differential equations can be used to improve DNN performance in a principled way, and iii) what real-world applications benefit from grounding DNNs in differential equations. We adopt a two-fold perspective spanning the model level, which interprets the whole DNN as a differential equation, and the layer level, which models individual DNN components as differential equations. From these two perspectives, we review how this framework connects model design, theoretical analysis, and performance improvement. We further discuss real-world applications, as well as key challenges and opportunities for future research.
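To make the two-fold perspective concrete, the correspondence most often invoked in this literature identifies a ResNet residual update x_{t+1} = x_t + f(x_t) with one forward-Euler step of the ODE dx/dt = f(x(t)). The sketch below is a minimal PyTorch illustration of that identification, not code from the survey; the class and function names (`ResidualBlock`, `euler_integrate`) are our own, and the choice of a two-layer tanh network for f is arbitrary.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """ResNet-style block: x_{t+1} = x_t + f(x_t; theta).

    Layer-level view: this is one forward-Euler step (step size
    h = 1) of the ODE  dx/dt = f(x(t); theta).
    """

    def __init__(self, dim: int):
        super().__init__()
        # An arbitrary parameterization of the vector field f.
        self.f = nn.Sequential(
            nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.f(x)  # Euler update with h = 1


def euler_integrate(f, x: torch.Tensor, t0=0.0, t1=1.0, steps=4):
    """Forward-Euler solve of dx/dt = f(x) on [t0, t1].

    Model-level view: with `steps` playing the role of depth and
    h = (t1 - t0) / steps, each update x <- x + h * f(x) acts like
    one residual layer, so the stacked network discretizes a
    continuous-time flow.
    """
    h = (t1 - t0) / steps
    for _ in range(steps):
        x = x + h * f(x)
    return x


if __name__ == "__main__":
    torch.manual_seed(0)
    block = ResidualBlock(dim=8)
    x = torch.randn(2, 8)
    print(block(x).shape)                    # one residual layer
    print(euler_integrate(block.f, x).shape) # a "deep" Euler trajectory
```

Under this reading, architectural choices (step size, solver order, stability constraints on f) map onto numerical-analysis choices, which is the bridge between model design and theoretical analysis that the survey develops.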