AI Summary
This paper addresses the limited expressive power of deep Heaviside networks (DHNs). To enhance their capacity, we propose two structural augmentations: (i) incorporating skip connections and (ii) embedding neurons with linear activation. Theoretically, we establish the first rigorous bounds on the VC dimension and approximation rates for both vanilla DHNs and their augmented variants. We prove that standard DHNs achieve only suboptimal approximation orders, whereas augmented DHNs attain optimal rates, e.g., $O(n^{-1/d})$ for Lipschitz functions in $d$ dimensions. Within a nonparametric regression framework, we further derive an $O_{\mathbb{P}}(n^{-2/(2+d)})$ statistical convergence rate for the corresponding estimators. Collectively, these results provide the first comprehensive theoretical foundation for discrete-activation deep networks that simultaneously guarantees optimal approximation capability and statistical learnability.
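To make the two augmentations concrete, below is a minimal NumPy sketch of a DHN forward pass with optional skip connections and a few linear-activation neurons per hidden layer. The function and parameter names (`dhn_forward`, `skip`, `linear_units`) are illustrative only, and the paper's exact parameterisation of the augmented architecture may differ.

```python
import numpy as np

def heaviside(z):
    """Elementwise Heaviside (threshold) activation: 1 if z >= 0, else 0."""
    return (z >= 0).astype(float)

def dhn_forward(x, weights, biases, skip=True, linear_units=0):
    """Forward pass of a deep Heaviside network (illustrative sketch).

    x            : input vector of shape (d,)
    weights[l]   : weight matrix of layer l
    biases[l]    : bias vector of layer l
    skip         : if True, concatenate the raw input onto every hidden layer
                   (one simple way to realize skip connections)
    linear_units : number of neurons per hidden layer kept with identity
                   (linear) activation instead of the Heaviside activation
    """
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        z = W @ h + b
        a = heaviside(z)
        if linear_units > 0:
            a[:linear_units] = z[:linear_units]    # embed linear-activation neurons
        h = np.concatenate([a, x]) if skip else a  # skip connection to the input
    return weights[-1] @ h + biases[-1]            # linear output layer

# Tiny usage example: d = 2 inputs, two hidden layers of width 4, scalar output.
rng = np.random.default_rng(0)
d, width = 2, 4
in_dims = [d, width + d]                  # hidden inputs grow by d when skip=True
weights = [rng.normal(size=(width, in_dims[0])),
           rng.normal(size=(width, in_dims[1])),
           rng.normal(size=(1, in_dims[1]))]
biases = [rng.normal(size=width), rng.normal(size=width), rng.normal(size=1)]
print(dhn_forward(np.array([0.3, -0.7]), weights, biases, skip=True, linear_units=1))
```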
Abstract
We show that deep Heaviside networks (DHNs) have limited expressiveness but that this can be overcome by including either skip connections or neurons with linear activation. We provide lower and upper bounds for the Vapnik-Chervonenkis (VC) dimensions and approximation rates of these network classes. As an application, we derive statistical convergence rates for DHN fits in the nonparametric regression model.
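For orientation, a schematic of the standard nonparametric regression setting behind the stated convergence rate, written under the common assumptions of i.i.d. design, additive noise, and a Lipschitz regression function; the paper's precise conditions, loss, and estimator definition may differ.

```latex
% Schematic nonparametric regression setting (standard formulation;
% the paper's exact assumptions may differ): i.i.d. observations with
% additive noise, a Lipschitz regression function f_0 on [0,1]^d, and a
% least-squares DHN fit \hat f_n.
\[
  Y_i = f_0(X_i) + \varepsilon_i, \qquad i = 1, \dots, n,
\]
\[
  \bigl\| \hat f_n - f_0 \bigr\|^2 = O_{\mathbb{P}}\!\bigl(n^{-2/(2+d)}\bigr).
\]
% For smoothness \beta = 1 (the Lipschitz case), this coincides with the
% classical minimax rate n^{-2\beta/(2\beta+d)} in nonparametric regression.
```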