🤖 AI Summary
This work studies the learning problem for forward–backward splitting (FBS) unfolded networks in the deep-layer limit, covering both the convergence of the training problem and its stability. The authors consider a network unrolled from the FBS algorithm, model its deep-layer limit within a difference/differential inclusion framework, and, under mild assumptions, establish that the network's training problem converges to the learning problem of the limit system. This yields a Γ-convergence argument for the trainable parameters of the FBS-induced network, complemented by a qualitative analysis of the stability of these learning problems under perturbations. In particular, any cluster point of the optimal network parameters is a solution of the learning problem of the limit system, and a simple numerical experiment is used to validate the main convergence result.
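For reference, the standard notion behind the cluster-point statement can be written out as follows. The symbols $F_K$ (training objective of the $K$-layer network), $F$ (objective of the limit learning problem), and $\theta$ (trainable parameters) are notation assumed here for illustration and are not taken from the paper; the topology on the parameter space is left unspecified. $F_K$ is said to $\Gamma$-converge to $F$ if, for every $\theta$,

$$
\begin{aligned}
&\text{(liminf inequality)} && F(\theta) \le \liminf_{K\to\infty} F_K(\theta_K) \quad \text{for every sequence } \theta_K \to \theta,\\
&\text{(recovery sequence)} && \text{there exists } \theta_K \to \theta \text{ with } \limsup_{K\to\infty} F_K(\theta_K) \le F(\theta).
\end{aligned}
$$

If $\theta_K^\ast$ minimizes $F_K$ and a subsequence $\theta_{K_j}^\ast$ converges to $\bar\theta$, then comparing with a recovery sequence $\eta_K \to \eta$ for an arbitrary $\eta$ gives

$$
F(\bar\theta) \;\le\; \liminf_{j\to\infty} F_{K_j}(\theta_{K_j}^\ast) \;\le\; \limsup_{j\to\infty} F_{K_j}(\eta_{K_j}) \;\le\; F(\eta),
$$

so $\bar\theta$ minimizes $F$. This is the generic mechanism only; the assumptions under which the paper verifies the two inequalities are not reproduced here.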
📝 Abstract
Deep unfolding neural networks derived from iterative optimization schemes and numerical ordinary/partial differential equations (ODEs/PDEs) have attracted much attention in data science over the last decade. Therein, numerous important network architectures were constructed from the basic forward-backward-splitting (FBS) algorithm. In this paper, we continue our research on the most basic FBS-induced network, an architecture unrolled from the original FBS algorithm by incorporating direct parameter relaxations. Following the difference/differential inclusion formulations in our previous forward system analyses, we here consider some theoretical aspects of the corresponding learning problems. Under some mild assumptions, we establish a general convergence property of the training problem of the basic FBS-induced network to the learning problem of the deep-layer limit system, implying a $\Gamma$-convergence argument showing that any cluster point of the optimal learning parameters for the network is a solution to the learning problem of the deep-layer limit system. A qualitative analysis of the perturbation stability of these learning problems is also presented. A simple numerical experiment is conducted to validate our main general convergence result.
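To make "unrolled from the original FBS algorithm" concrete, here is a minimal NumPy sketch of a forward pass through an unfolded FBS network. It is an illustration under stated assumptions, not the paper's architecture: the smooth term is taken to be a least-squares loss, the nonsmooth term an $\ell_1$ penalty (so the backward step is soft-thresholding), and the "parameter relaxation" is modeled simply as a separate step size and threshold per layer. The names `fbs_unfolded_forward`, `taus`, and `lams` are hypothetical.

```python
import numpy as np

def soft_threshold(v, thresh):
    """Proximal map of thresh * ||.||_1, used as the backward step (assumed l1 regularizer)."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def fbs_unfolded_forward(A, b, taus, lams, x0=None):
    """Forward pass through a K-layer unrolled FBS network, K = len(taus).

    Layer k applies a forward (gradient) step on 0.5 * ||A x - b||^2 with
    step size taus[k], then the backward (proximal) step with threshold
    taus[k] * lams[k]. Per-layer taus/lams stand in for trainable parameters.
    """
    x = np.zeros(A.shape[1]) if x0 is None else np.asarray(x0, dtype=float).copy()
    for tau, lam in zip(taus, lams):
        grad = A.T @ (A @ x - b)                        # forward step direction
        x = soft_threshold(x - tau * grad, tau * lam)   # backward step
    return x

def objective(A, b, lam, x):
    """Composite objective 0.5 * ||A x - b||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

# Toy sparse recovery problem (synthetic data, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[:3] = [1.0, -2.0, 1.5]
b = A @ x_true

lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient

# Sharing one step size and threshold across all layers recovers plain FBS (ISTA).
x_shallow = fbs_unfolded_forward(A, b, np.full(10, step), np.full(10, lam))
x_deep = fbs_unfolded_forward(A, b, np.full(200, step), np.full(200, lam))
```

With shared parameters and this step size the composite objective does not increase from layer to layer, so `objective(A, b, lam, x_deep)` is no larger than `objective(A, b, lam, x_shallow)`. In a learning setting, `taus` and `lams` would instead be fitted to data, which is the training problem whose deep-layer behavior the abstract addresses.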