Mathematical Modeling and Convergence Analysis of Deep Neural Networks with Dense Layer Connectivities in Deep Learning

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Densely connected deep neural networks (DNNs) lack rigorous mathematical modeling and convergence analysis in the infinite-depth limit. Method: This paper introduces the Dense Nonlocal (DNL) framework, the first to model dense layers as systems of nonlinear integral equations and characterize training dynamics from an optimal control perspective. It integrates optimal control theory, piecewise-linear extensions, and Γ-convergence analysis. Contributions/Results: We rigorously prove: (1) the empirical risk optimum converges as depth tends to infinity; and (2) a subsequence of corresponding minimizers converges weakly to a solution of the continuous-time optimal control problem. The DNL framework reveals the intrinsic stability mechanism of dense connectivity in the deep limit and establishes the first convergence guarantee, and a foundational theoretical basis, for densely connected DNNs grounded in a continuous-depth limit.

📝 Abstract
In deep learning, dense layer connectivity has become a key design principle in deep neural networks (DNNs), enabling efficient information flow and strong performance across a range of applications. In this work, we model densely connected DNNs mathematically and analyze their learning problems in the deep-layer limit. For broad applicability, we present our analysis in a framework setting of DNNs with densely connected layers and general non-local feature transformations (with local feature transformations as special cases) within layers, called the dense non-local (DNL) framework, which includes standard DenseNets and variants as special examples. In this formulation, the densely connected networks are modeled as nonlinear integral equations, in contrast to the ordinary differential equation viewpoint commonly adopted in prior works. We study the associated training problems from an optimal control perspective and prove convergence results from the network learning problem to its continuous-time counterpart. In particular, we show the convergence of optimal values and the subsequence convergence of minimizers, using a piecewise linear extension and $\Gamma$-convergence analysis. Our results provide a mathematical foundation for understanding densely connected DNNs and further suggest that such architectures can offer stability when training deep models.
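The abstract contrasts the integral-equation view with the ODE view of ResNet-style models. A schematic way to see the distinction (the symbols and kernel form here are illustrative, not the paper's exact notation): a residual network in the deep limit follows an ODE, whereas dense connectivity, where each layer aggregates all earlier features, suggests a Volterra-type nonlinear integral equation:

$$
\text{ResNet-type (ODE view):} \quad \dot{x}(t) = f\bigl(x(t), \theta(t)\bigr),
$$

$$
\text{Dense connectivity (integral view):} \quad x(t) = x(0) + \int_0^t K(t,s)\, f\bigl(x(s), \theta(s)\bigr)\, ds,
$$

where the kernel $K(t,s)$ encodes how the feature at depth $t$ weights the transformed features from all shallower depths $s \le t$. Setting $K(t,s) \equiv I$ recovers the integrated ODE, so the local (ODE) picture appears as a special case of the nonlocal one, consistent with the abstract's framing.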
Problem

Research questions and friction points this paper is trying to address.

Modeling densely connected deep neural networks as nonlinear integral equations
Analyzing convergence from discrete learning to continuous optimal control
Establishing mathematical foundation for training stability in deep architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modeled dense DNNs as nonlinear integral equations
Analyzed training from optimal control perspective
Proved convergence using piecewise linear extension
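To make "dense layer connectivity" concrete, below is a minimal sketch of a DenseNet-style forward pass in which every layer receives the concatenation of all earlier feature vectors. This is an illustrative toy (NumPy, ReLU activations, random weights), not the paper's DNL formulation; the function and variable names are hypothetical.

```python
import numpy as np

def dense_forward(x0, layers):
    """Forward pass with dense connectivity: each layer consumes the
    concatenation of ALL previous features (input included), in the
    spirit of DenseNet. `layers` is a list of (W, b) pairs.
    Illustrative sketch only, not the paper's exact model."""
    feats = [x0]
    for W, b in layers:
        z = np.concatenate(feats)                  # reuse every earlier feature
        feats.append(np.maximum(0.0, W @ z + b))   # ReLU feature transformation
    return np.concatenate(feats)                   # all features survive to the output

# Toy example: input dim 4, each of 3 layers adds 2 new features,
# so each layer's weight matrix grows with the accumulated feature dim.
rng = np.random.default_rng(0)
x0 = rng.normal(size=4)
layers, dim = [], 4
for _ in range(3):
    layers.append((rng.normal(size=(2, dim)), np.zeros(2)))
    dim += 2

out = dense_forward(x0, layers)
print(out.shape)  # (10,): 4 input features + 3 layers x 2 new features
```

Because no feature is ever discarded, information from the input flows directly to every later layer, which is the structural property the paper links to training stability in the deep limit.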