🤖 AI Summary
This work formulates ResNet training as an optimal control problem, applicable to standard residual architectures and general loss functions. Methodologically, it penalizes intermediate outputs of hidden states, which correspond to stage cost terms in optimal control; for standard ResNets, these intermediate outputs are obtained by propagating the hidden state through the subsequent skip connections and the output layer. The resulting training dynamics bias the weights of unnecessary deeper residual layers toward zero, exposing a mechanism by which redundant residual weights vanish during optimization. Practically, this suggests a theory-grounded layer-pruning strategy in which network depth is adaptively compressed as part of training.
📝 Abstract
We propose a training formulation for ResNets as an optimal control problem that applies to standard architectures and general loss functions. We bridge the two settings by penalizing intermediate outputs of hidden states, which correspond to stage cost terms in optimal control. For standard ResNets, we obtain intermediate outputs by propagating the state through the subsequent skip connections and the output layer. We demonstrate that the resulting training dynamics bias the weights of unnecessary deeper residual layers toward zero, indicating the potential for a theory-grounded layer-pruning strategy.
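To illustrate the stage-cost construction, the following is a minimal NumPy sketch (all names and the toy architecture are illustrative assumptions, not the paper's implementation). For a standard ResNet, skipping the residual branches of all later blocks is the identity map, so the intermediate output at depth k is simply the shared output layer applied to the current hidden state; each such output contributes a stage cost to the total training loss.

```python
import numpy as np

# Hypothetical toy setup: a "ResNet" x_{k+1} = x_k + f_k(x_k)
# with a shared linear output head h(x) = x @ W_out.
rng = np.random.default_rng(0)
d, L = 4, 3                                                   # width, number of residual blocks
Ws = [rng.normal(scale=0.1, size=(d, d)) for _ in range(L)]   # residual-block weights
W_out = rng.normal(scale=0.1, size=(d, 1))                    # output layer

def head(x):
    return x @ W_out

def forward_with_stage_costs(x, y, lam=0.1):
    """Total loss = final loss + lam * sum of intermediate (stage) costs."""
    stage_losses = []
    for W in Ws:
        # Intermediate output at this depth: propagate through the remaining
        # skip connections only (identity), then apply the output layer.
        y_k = head(x)
        stage_losses.append(float(np.mean((y_k - y) ** 2)))
        x = x + np.tanh(x @ W)                                # residual block update
    final_loss = float(np.mean((head(x) - y) ** 2))
    return final_loss + lam * sum(stage_losses), stage_losses

x = rng.normal(size=(8, d))   # batch of 8 inputs
y = rng.normal(size=(8, 1))
total, stages = forward_with_stage_costs(x, y)
print(len(stages))            # one stage cost per residual block
```

Under this regularizer, early hidden states are already pushed to be predictive, so gradients give later residual branches little incentive to deviate from the identity, consistent with the vanishing-deep-weight behavior described above.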