🤖 AI Summary
This paper investigates the minimum depth required for ReLU networks with rational weights to exactly represent the function $F_n = \max\{0, x_1, \dots, x_n\}$. Focusing on finite-precision weights (specifically, decimal and general $N$-ary fractional representations), the authors develop a novel approach combining combinatorial reasoning with piecewise linear analysis. They rigorously establish that for weights that are decimal fractions, the depth lower bound is $\lceil \log_3(n+1) \rceil$ hidden layers, and that for arbitrary $N$-ary fractional weights it is $\Omega\left(\frac{\ln n}{\ln \ln N}\right)$. This constitutes the first nontrivial, superconstant depth lower bound for ReLU networks with practical finite-precision (non-floating-point, non-real) weights. The result extends the $\lceil \log_2(n+1) \rceil$ lower bound, previously established for integer weights, toward the rational setting, and provides foundational insight into the quantitative trade-off between weight precision and expressive power in neural networks.
📝 Abstract
To confirm that the expressive power of ReLU neural networks grows with their depth, the function $F_n = \max\{0, x_1, \ldots, x_n\}$ has been considered in the literature. A conjecture by Hertrich, Basu, Di Summa, and Skutella [NeurIPS 2021] states that any ReLU network that exactly represents $F_n$ has at least $\lceil \log_2(n+1) \rceil$ hidden layers. The conjecture has recently been confirmed for networks with integer weights by Haase, Hertrich, and Loho [ICLR 2023]. We follow up on this line of research and show that, within ReLU networks whose weights are decimal fractions, $F_n$ can only be represented by networks with at least $\lceil \log_3(n+1) \rceil$ hidden layers. Moreover, if all weights are $N$-ary fractions, then $F_n$ can only be represented by networks with at least $\Omega\left(\frac{\ln n}{\ln \ln N}\right)$ layers. These results are a partial confirmation of the above conjecture for rational ReLU networks, and provide the first non-constant lower bound on the depth of practically relevant ReLU networks.
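As a quick numerical companion to the abstract: $F_n$ can be evaluated by composing the exact piecewise-linear identity $\max(a, b) = a + \mathrm{ReLU}(b - a)$ along a balanced binary tree over the $n+1$ values $\{0, x_1, \ldots, x_n\}$, which is the standard construction behind the matching $\lceil \log_2(n+1) \rceil$-hidden-layer upper bound over real weights. The sketch below (plain Python arithmetic, not an actual network; the helper names `F`, `max2`, and `ceil_log` are illustrative and not from the paper) checks this identity and tabulates the $\lceil \log_2(n+1) \rceil$ and $\lceil \log_3(n+1) \rceil$ depth lower bounds for a few values of $n$.

```python
def relu(x):
    return max(0.0, x)

def max2(a, b):
    # Exact piecewise-linear identity: max(a, b) = a + ReLU(b - a)
    return a + relu(b - a)

def F(xs):
    # F_n(x) = max{0, x_1, ..., x_n}, evaluated via a balanced tree of
    # pairwise maxima over the n+1 values {0, x_1, ..., x_n}.
    vals = [0.0] + [float(x) for x in xs]
    while len(vals) > 1:
        vals = [max2(vals[i], vals[i + 1]) if i + 1 < len(vals) else vals[i]
                for i in range(0, len(vals), 2)]
    return vals[0]

def ceil_log(base, m):
    # Smallest d with base**d >= m, computed in exact integer arithmetic
    # to avoid floating-point error at exact powers (e.g. log_3 of 27).
    d, p = 0, 1
    while p < m:
        p *= base
        d += 1
    return d

# Tabulate the integer-weight (log_2) and decimal-fraction (log_3)
# depth lower bounds side by side.
for n in (1, 3, 8, 26, 1000):
    print(n, ceil_log(2, n + 1), ceil_log(3, n + 1))
```

For instance, at $n = 26$ the integer-weight bound is $\lceil \log_2 27 \rceil = 5$ hidden layers, while the decimal-fraction bound from the paper is $\lceil \log_3 27 \rceil = 3$, illustrating how the guaranteed depth shrinks as the weight class grows.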