🤖 AI Summary
This work investigates the minimal depth required for ReLU and maxout neural networks to exactly represent continuous piecewise-linear (PWL) functions on $\mathbb{R}^d$, with a focus on networks compatible with the braid fan, the polyhedral fan induced by the braid arrangement. Leveraging tools from polyhedral combinatorics, braid permutation theory, and maxout rank analysis, we establish a non-constant lower bound of $\Omega(\log\log d)$ hidden layers for braid-compatible networks that compute the maximum of $d$ numbers. Under the same assumption, we provide a purely combinatorial proof that computing the maximum of 5 numbers requires at least three hidden layers. Furthermore, we demonstrate that the natural generalization of the best known depth upper bound to maxout networks is not tight: the maximum of 7 numbers is realizable by a rank-3 maxout layer followed by a rank-2 maxout layer. For general PWL functions the best known lower bound remains 2; our result shows that, within the braid-compatible setting, the required depth grows at least double-logarithmically in the input dimension $d$.
📝 Abstract
We contribute towards resolving the open question of how many hidden layers are required in ReLU networks for exactly representing all continuous and piecewise linear functions on $\mathbb{R}^d$. While the question has been resolved in special cases, the best known lower bound in general is still 2. We focus on neural networks that are compatible with certain polyhedral complexes, more precisely with the braid fan. For such neural networks, we prove a non-constant lower bound of $\Omega(\log\log d)$ hidden layers required to exactly represent the maximum of $d$ numbers. Additionally, under our assumption, we provide a combinatorial proof that 3 hidden layers are necessary to compute the maximum of 5 numbers; this had only been verified with an excessive computation so far. Finally, we show that a natural generalization of the best known upper bound to maxout networks is not tight, by demonstrating that a rank-3 maxout layer followed by a rank-2 maxout layer is sufficient to represent the maximum of 7 numbers.
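For context on the upper-bound side, the following is a minimal sketch of the standard pairwise "tournament" construction, which computes the maximum of $d$ numbers with $\lceil \log_2 d \rceil$ hidden ReLU layers. It is an illustration only and not taken from the paper; the function names `relu`, `pairwise_max`, and `tournament_max` are hypothetical. It relies on the identity $\max(a, b) = \mathrm{relu}(a) - \mathrm{relu}(-a) + \mathrm{relu}(b - a)$, which uses one hidden layer with three ReLU units.

```python
def relu(x):
    """Rectified linear unit."""
    return max(x, 0.0)


def pairwise_max(a, b):
    # One hidden ReLU layer with three units:
    # relu(a) - relu(-a) reproduces a, and relu(b - a) adds the excess of b over a.
    return relu(a) - relu(-a) + relu(b - a)


def tournament_max(values):
    """Return (maximum, number of hidden layers used) for a non-empty list."""
    layer = list(values)
    depth = 0
    while len(layer) > 1:
        # Pair up neighbours; each pair is reduced by one pairwise_max gadget.
        nxt = [pairwise_max(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            # An unpaired value is carried forward unchanged
            # (in a real network: relu(x) - relu(-x)).
            nxt.append(layer[-1])
        layer = nxt
        depth += 1
    return layer[0], depth


if __name__ == "__main__":
    value, depth = tournament_max([3, 1, 4, 1, 5])
    print(value, depth)
```

For 5 inputs this construction uses three hidden layers, which matches the number the abstract proves necessary for braid-compatible networks; the sketch itself makes no claim about lower bounds.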