Neural Networks and (Virtual) Extended Formulations

📅 2024-11-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper establishes lower bounds on the minimal size of piecewise-linear neural networks (e.g., ReLU or maxout) that solve linear optimization problems. Methodologically, it forges a rigorous connection between network size and the polyhedral extension complexity xc(P), proving that xc(P) is a lower bound on the size of any monotone or input-convex network solving the linear optimization problem over a polytope P. This yields exponential lower bounds for such restricted networks on several polynomial-time solvable problems, including maximum-weight matching. Toward similar bounds for general networks, the paper introduces *virtual extension complexity* vxc(P), which generalizes xc(P) and counts the inequalities needed to express linear optimization over P as a difference of two linear programs; vxc(P) is shown to lower-bound the size of any ReLU/maxout network that optimizes over P. Deriving strong lower bounds on vxc(P) remains open, but the paper shows that one can efficiently optimize over P given a small virtual extended formulation, establishing a polyhedral-geometric framework for analyzing deep neural networks.
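The idea of an extended formulation, central to the summary above, is that a polytope with many facets can be the projection of a higher-dimensional polytope with few facets. A minimal illustrative sketch (not taken from the paper; the cross-polytope example and all names are standard textbook material) optimizes over the ℓ1-ball in R^n, which has 2^n facets, via a lifted LP with only O(n) inequalities:

```python
import numpy as np
from scipy.optimize import linprog

# The cross-polytope P = {x : sum_i |x_i| <= 1} has 2^n facets, but the
# lifted polytope Q = {(x, y) : -y <= x <= y, sum_i y_i <= 1, y >= 0}
# has only O(n) inequalities and projects onto P under (x, y) -> x.
n = 3
c = np.array([2.0, -1.0, 0.5])   # objective: maximize c.x over P

# Variables z = (x, y) in R^{2n}; linprog minimizes, so negate c on x.
obj = np.concatenate([-c, np.zeros(n)])

I = np.eye(n)
A_ub = np.vstack([
    np.hstack([I, -I]),                              #  x - y <= 0
    np.hstack([-I, -I]),                             # -x - y <= 0
    np.hstack([np.zeros(n), np.ones(n)])[None, :],   # sum(y) <= 1
])
b_ub = np.zeros(2 * n + 1)
b_ub[-1] = 1.0

bounds = [(None, None)] * n + [(0, None)] * n  # x free, y >= 0
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x_opt, value = res.x[:n], -res.fun
# The optimum over the l1-ball is attained at a vertex +/- e_i with the
# largest |c_i|; here that is x = (1, 0, 0) with value 2.
```

The paper's lower-bound results concern the reverse direction: when no such small lifted formulation exists, a correspondingly large neural network is needed.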

📝 Abstract
Neural networks with piecewise linear activation functions, such as rectified linear units (ReLU) or maxout, are among the most fundamental models in modern machine learning. We make a step towards proving lower bounds on the size of such neural networks by linking their representative capabilities to the notion of the extension complexity $\mathrm{xc}(P)$ of a polytope $P$. This is a well-studied quantity in combinatorial optimization and polyhedral geometry describing the number of inequalities needed to model $P$ as a linear program. We show that $\mathrm{xc}(P)$ is a lower bound on the size of any monotone or input-convex neural network that solves the linear optimization problem over $P$. This implies exponential lower bounds on such neural networks for a variety of problems, including the polynomially solvable maximum weight matching problem. In an attempt to prove similar bounds also for general neural networks, we introduce the notion of virtual extension complexity $\mathrm{vxc}(P)$, which generalizes $\mathrm{xc}(P)$ and describes the number of inequalities needed to represent the linear optimization problem over $P$ as a difference of two linear programs. We prove that $\mathrm{vxc}(P)$ is a lower bound on the size of any neural network that optimizes over $P$. While it remains an open question to derive useful lower bounds on $\mathrm{vxc}(P)$, we argue that this quantity deserves to be studied independently from neural networks by proving that one can efficiently optimize over a polytope $P$ using a small virtual extended formulation.
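The quantities in the abstract can be sketched formally. The first line is the standard definition of extension complexity from the polyhedral-combinatorics literature; the second is a schematic reading of the abstract's description of virtual extension complexity (the paper's precise definition may differ in details):

```latex
% Standard extension complexity: fewest facets of any lift of P.
\mathrm{xc}(P) \;=\; \min\bigl\{\,\#\mathrm{facets}(Q) \;:\;
  \pi(Q) = P \text{ for some polytope } Q \text{ and affine map } \pi \bigr\}

% Virtual extension complexity, schematically: the fewest total
% inequalities in two lifted LPs whose difference reproduces linear
% optimization over P for every objective c:
\max_{x \in P} c^{\top} x
  \;=\; \max_{y \in Q_1} \ell_1(y, c) \;-\; \max_{z \in Q_2} \ell_2(z, c)
  \qquad \forall c,
% with \ell_1, \ell_2 linear, and vxc(P) bounded by
% #facets(Q_1) + #facets(Q_2) over all such pairs (Q_1, Q_2).
```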
Problem

Research questions and friction points this paper is trying to address.

How does the size of a piecewise-linear neural network relate to the extension complexity of the polytope it optimizes over?
Can a generalized measure (virtual extension complexity) yield lower bounds for arbitrary ReLU/maxout networks?
Which polynomial-time solvable optimization problems provably require large neural networks?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Link between neural network size and polytope extension complexity
Introduction of virtual extension complexity vxc(P)
Exponential lower bounds on monotone and input-convex network size