🤖 AI Summary
This work investigates the expressive power of one-dimensional ReLU deep neural networks (DNNs), specifically characterizing the number of linear regions they can realize.
Method: We introduce the notion of *function-adaptive sparsity*—a quantitative measure of how efficiently a network activates linear regions relative to the minimal number required to approximate a target function. Under random initialization with He scaling and nonzero biases, we perform a rigorous infinite-width limit analysis to derive an exact analytical expression for the expected number of linear regions.
Contribution/Results: We prove that the expected number of linear regions is, to leading order, the total width (the sum of the hidden-layer widths) plus one, so that it grows linearly in both width and depth. This is the first work to decouple and quantify the distinct, synergistic roles of width and depth in governing linear region count—revealing their joint contribution to functional complexity. Our results provide a novel theoretical framework for understanding the architecture–expressivity relationship in DNNs.
📝 Abstract
We study the expressivity of one-dimensional (1D) ReLU deep neural networks through the lens of their linear regions. For randomly initialized, fully connected 1D ReLU networks (He scaling with nonzero bias) in the infinite-width limit, we prove that the expected number of linear regions grows as $\sum_{\ell = 1}^{L} n_\ell + o\!\left(\sum_{\ell = 1}^{L} n_\ell\right) + 1$, where $n_\ell$ denotes the number of neurons in the $\ell$-th hidden layer. We also propose a function-adaptive notion of sparsity that compares the expected number of regions used by the network to the minimal number needed to approximate a target within a fixed tolerance.
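The leading-order result can be checked empirically. The sketch below (an illustration, not the paper's proof technique) samples a random 1D fully connected ReLU network with He scaling and nonzero Gaussian biases, then counts its linear pieces on an interval by detecting slope changes of a densely sampled interpolant; the grid resolution, interval, and bias variance are assumptions not fixed by the abstract.

```python
import numpy as np

def random_relu_net(widths, seed=0):
    """Sample weights/biases for a 1D fully connected ReLU net.

    He scaling (std = sqrt(2 / fan_in)) with nonzero Gaussian biases,
    mirroring the initialization assumed in the paper. Bias std = 1
    is an illustrative choice, not taken from the source.
    """
    rng = np.random.default_rng(seed)
    params = []
    fan_in = 1
    for w in list(widths) + [1]:  # hidden layers, then scalar output
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(w, fan_in))
        b = rng.normal(0.0, 1.0, size=w)  # nonzero biases
        params.append((W, b))
        fan_in = w
    return params

def forward(params, x):
    """Evaluate the network on a 1D array of inputs."""
    h = x.reshape(1, -1)
    for W, b in params[:-1]:
        h = np.maximum(W @ h + b[:, None], 0.0)  # ReLU hidden layers
    W, b = params[-1]
    return (W @ h + b[:, None]).ravel()  # affine output layer

def count_linear_regions(params, lo=-10.0, hi=10.0, n=200_001):
    """Count linear pieces on [lo, hi] via slope changes on a dense grid.

    Breakpoints falling outside [lo, hi] are missed, so this is a
    lower estimate of the global region count.
    """
    x = np.linspace(lo, hi, n)
    y = forward(params, x)
    slopes = np.diff(y) / np.diff(x)
    # A new region starts wherever the local slope changes.
    changes = np.sum(~np.isclose(slopes[1:], slopes[:-1], atol=1e-8))
    return int(changes) + 1

widths = [16, 16]
regions = count_linear_regions(random_relu_net(widths, seed=1))
# The theory predicts roughly sum(widths) + 1 = 33 regions in expectation.
print(regions)
```

Averaging `regions` over many seeds should track the predicted leading term $\sum_\ell n_\ell + 1$ (here 33), up to the lower-order correction and the truncation to a finite interval.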