On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth

📅 2025-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper investigates the minimum depth required for ReLU networks with rational weights to exactly represent the function $F_n = \max\{0, x_1, \dots, x_n\}$. Focusing on weights that are decimal fractions or, more generally, $N$-ary fractions, the authors combine combinatorial reasoning with piecewise linear analysis. They show that with decimal-fraction weights at least $\lceil \log_3(n+1) \rceil$ hidden layers are needed, and that with arbitrary $N$-ary fraction weights at least $\Omega\left(\frac{\ln n}{\ln \ln N}\right)$ layers are needed. This is the first non-constant depth lower bound for practically relevant ReLU networks. The result partially confirms, for rational weights, the conjectured lower bound of $\lceil \log_2(n+1) \rceil$ hidden layers, which had previously been confirmed only for integer weights, and it sheds light on the trade-off between weight precision and expressive power in neural networks.
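To make the two explicit bounds concrete, the following minimal sketch tabulates the conjectured bound $\lceil \log_2(n+1) \rceil$ next to the decimal-fraction bound $\lceil \log_3(n+1) \rceil$. The function names are illustrative assumptions and do not come from the paper. The $N$-ary bound is left out because the $\Omega$ notation hides a constant that the summary does not state.

```python
def ceil_log(base: int, value: int) -> int:
    """Return the smallest k with base**k >= value, using exact integer arithmetic."""
    k, power = 0, 1
    while power < value:
        power *= base
        k += 1
    return k


def conjectured_hidden_layers(n: int) -> int:
    """Conjectured lower bound ceil(log2(n + 1)) for exactly representing F_n."""
    return ceil_log(2, n + 1)


def decimal_fraction_hidden_layers(n: int) -> int:
    """Lower bound ceil(log3(n + 1)) stated for decimal-fraction weights."""
    return ceil_log(3, n + 1)


if __name__ == "__main__":
    for n in (1, 3, 8, 26, 80):
        print(n, conjectured_hidden_layers(n), decimal_fraction_hidden_layers(n))
```

Integer powers are used instead of floating-point logarithms so that exact powers such as $n + 1 = 27$ cannot be rounded to the wrong side of the ceiling.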

📝 Abstract

To confirm that the expressive power of ReLU neural networks grows with their depth, the function $F_n = \max\{0, x_1, \ldots, x_n\}$ has been considered in the literature. A conjecture by Hertrich, Basu, Di Summa, and Skutella [NeurIPS 2021] states that any ReLU network that exactly represents $F_n$ has at least $\lceil \log_2(n+1) \rceil$ hidden layers. The conjecture has recently been confirmed for networks with integer weights by Haase, Hertrich, and Loho [ICLR 2023]. We follow up on this line of research and show that, within ReLU networks whose weights are decimal fractions, $F_n$ can only be represented by networks with at least $\lceil \log_3(n+1) \rceil$ hidden layers. Moreover, if all weights are $N$-ary fractions, then $F_n$ can only be represented by networks with at least $\Omega\left(\frac{\ln n}{\ln \ln N}\right)$ layers. These results are a partial confirmation of the above conjecture for rational ReLU networks, and provide the first non-constant lower bound on the depth of practically relevant ReLU networks.
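For context on why $\lceil \log_2(n+1) \rceil$ is the natural target, the sketch below evaluates $F_n$ with a tree of pairwise maxima, using the identity $\max(a, b) = \mathrm{ReLU}(a - b) + \mathrm{ReLU}(b) - \mathrm{ReLU}(-b)$, so each round corresponds to one hidden layer with weights $\pm 1$. This is the standard textbook-style construction of a matching upper bound, not a method taken from the paper, and the helper names `pairwise_max` and `f_n` are assumptions made for illustration. It simulates the layers on concrete inputs rather than building an actual network.

```python
from fractions import Fraction


def relu(x):
    """Rectified linear unit on an exact rational value."""
    return max(x, 0)


def pairwise_max(a, b):
    """max(a, b) written with ReLUs only: relu(a - b) + relu(b) - relu(-b)."""
    return relu(a - b) + relu(b) - relu(-b)


def f_n(xs):
    """Evaluate F_n = max{0, x_1, ..., x_n}; return (value, hidden layers used)."""
    values = [Fraction(0)] + [Fraction(x) for x in xs]
    layers = 0
    while len(values) > 1:
        # One round of pairwise maxima corresponds to one hidden layer.
        reduced = [
            pairwise_max(values[i], values[i + 1])
            for i in range(0, len(values) - 1, 2)
        ]
        if len(values) % 2 == 1:
            # An unpaired value is carried over; a real network would pass it
            # through the layer as relu(v) - relu(-v).
            reduced.append(values[-1])
        values = reduced
        layers += 1
    return values[0], layers


if __name__ == "__main__":
    print(f_n([Fraction(1, 2), -3, Fraction(7, 10)]))
```

With $n$ inputs plus the constant $0$, the list has $n + 1$ entries and halves each round, so the loop runs $\lceil \log_2(n+1) \rceil$ times. The lower bounds in the abstract concern whether fewer hidden layers could ever suffice.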

Problem

Research questions and friction points this paper is trying to address.

Expressive power of ReLU networks

Depth requirements for function representation

Lower bounds with rational weights

Innovation

Methods, ideas, or system contributions that make the work stand out.

ReLU networks with decimal fractions

Minimum hidden layers for $F_n$

Non-constant depth lower bounds

🔎 Similar Papers

Compelling ReLU Networks to Exhibit Exponentially Many Linear Regions at Initialization and During Training