AI Summary
This paper addresses the limited expressive power of deep Heaviside networks (DHNs). To enhance their capacity, we propose two structural augmentations: (i) incorporating skip connections and (ii) embedding neurons with linear activation. Theoretically, we establish the first rigorous bounds on the VC dimension and approximation rates for both vanilla DHNs and their augmented variants. We prove that standard DHNs achieve only suboptimal approximation orders, whereas augmented DHNs attain optimal rates, e.g., $O(n^{-1/d})$ for Lipschitz functions in $d$ dimensions. Within a nonparametric regression framework, we further derive an $O_{\mathbb{P}}(n^{-2/(2+d)})$ statistical convergence rate for the corresponding estimators. Collectively, these results provide the first comprehensive theoretical foundation for discrete-activation deep networks that simultaneously guarantees optimal approximation capability and statistical learnability.
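To make the two augmentations concrete, below is a minimal NumPy sketch of a DHN forward pass with optional skip connections and a few linear-activation neurons per hidden layer. The function and parameter names (`dhn_forward`, `skip`, `linear_units`) are illustrative only, and the paper's exact parameterisation of the augmented architecture may differ.

```python
import numpy as np

def heaviside(z):
    """Elementwise Heaviside (threshold) activation: 1 if z >= 0, else 0."""
    return (z >= 0).astype(float)

def dhn_forward(x, weights, biases, skip=True, linear_units=0):
    """Forward pass of a deep Heaviside network (illustrative sketch).

    x            : input vector of shape (d,)
    weights[l]   : weight matrix of layer l
    biases[l]    : bias vector of layer l
    skip         : if True, concatenate the raw input onto every hidden layer
                   (one simple way to realize skip connections)
    linear_units : number of neurons per hidden layer kept with identity
                   (linear) activation instead of the Heaviside activation
    """
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        z = W @ h + b
        a = heaviside(z)
        if linear_units > 0:
            a[:linear_units] = z[:linear_units]    # embed linear-activation neurons
        h = np.concatenate([a, x]) if skip else a  # skip connection to the input
    return weights[-1] @ h + biases[-1]            # linear output layer

# Tiny usage example: d = 2 inputs, two hidden layers of width 4, scalar output.
rng = np.random.default_rng(0)
d, width = 2, 4
in_dims = [d, width + d]                  # hidden inputs grow by d when skip=True
weights = [rng.normal(size=(width, in_dims[0])),
           rng.normal(size=(width, in_dims[1])),
           rng.normal(size=(1, in_dims[1]))]
biases = [rng.normal(size=width), rng.normal(size=width), rng.normal(size=1)]
print(dhn_forward(np.array([0.3, -0.7]), weights, biases, skip=True, linear_units=1))
```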
Abstract
We show that deep Heaviside networks (DHNs) have limited expressiveness but that this can be overcome by including either skip connections or neurons with linear activation. We provide lower and upper bounds for the Vapnik-Chervonenkis (VC) dimensions and approximation rates of these network classes. As an application, we derive statistical convergence rates for DHN fits in the nonparametric regression model.
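For orientation, a schematic of the standard nonparametric regression setting behind the stated convergence rate, written under the common assumptions of i.i.d. design, additive noise, and a Lipschitz regression function; the paper's precise conditions, loss, and estimator definition may differ.

```latex
% Schematic nonparametric regression setting (standard formulation;
% the paper's exact assumptions may differ): i.i.d. observations with
% additive noise, a Lipschitz regression function f_0 on [0,1]^d, and a
% least-squares DHN fit \hat f_n.
\[
  Y_i = f_0(X_i) + \varepsilon_i, \qquad i = 1, \dots, n,
\]
\[
  \bigl\| \hat f_n - f_0 \bigr\|^2 = O_{\mathbb{P}}\!\bigl(n^{-2/(2+d)}\bigr).
\]
% For smoothness \beta = 1 (the Lipschitz case), this coincides with the
% classical minimax rate n^{-2\beta/(2\beta+d)} in nonparametric regression.
```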