On the Local Complexity of Linear Regions in Deep ReLU Networks

📅 2024-12-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates the local complexity of deep ReLU networks, defined as the density of linear regions with respect to the input distribution, and uncovers its intrinsic connections to feature learning, the total variation (TV) of the network function, and adversarial robustness.

Method: It introduces the concept of "local complexity" and rigorously establishes quantitative links between weight-space geometry and function-level properties, including TV and robustness, using piecewise-linear function theory, measure theory, and functional analysis.

Contribution/Results: Theoretically, local complexity upper-bounds the function's total variation; low-dimensional feature learning substantially reduces local complexity, thereby lowering the representation cost; and optimization dynamics inherently favor low-complexity solutions. This framework provides a unified, quantifiable theoretical foundation for understanding implicit regularization, the geometric origin of adversarial robustness, and the geometric nature of feature learning in ReLU networks.

📝 Abstract
We define the local complexity of a neural network with continuous piecewise linear activations as a measure of the density of linear regions over an input data distribution. We show theoretically that ReLU networks that learn low-dimensional feature representations have a lower local complexity. This allows us to connect recent empirical observations on feature learning at the level of the weight matrices with concrete properties of the learned functions. In particular, we show that the local complexity serves as an upper bound on the total variation of the function over the input data distribution and thus that feature learning can be related to adversarial robustness. Lastly, we consider how optimization drives ReLU networks towards solutions with lower local complexity. Overall, this work contributes a theoretical framework towards relating geometric properties of ReLU networks to different aspects of learning such as feature learning and representation cost.
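The abstract defines local complexity as a measure of the density of linear regions of a piecewise linear network over the input distribution. As an informal illustration (not the paper's construction), this density can be estimated empirically: each distinct ReLU activation pattern corresponds to one linear region, so counting pattern changes along a segment through input space gives a regions-per-unit-length estimate. The random network, layer sizes, and `local_region_density` helper below are hypothetical choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# A small random ReLU network: 2 -> 16 -> 16 -> 1 (hypothetical sizes)
dims = [2, 16, 16, 1]
weights = [rng.standard_normal((m, n)) / np.sqrt(n) for n, m in zip(dims[:-1], dims[1:])]
biases = [0.1 * rng.standard_normal(m) for m in dims[1:]]

def activation_pattern(x):
    """Concatenated on/off pattern of all hidden ReLU units at input x.

    Inputs with the same pattern lie in the same linear region."""
    pattern = []
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):  # hidden layers only
        pre = W @ h + b
        pattern.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return np.concatenate(pattern).tobytes()

def local_region_density(p, q, n_samples=2000):
    """Estimate linear-region density along the segment from p to q:
    number of distinct activation patterns per unit length."""
    ts = np.linspace(0.0, 1.0, n_samples)
    patterns = {activation_pattern((1 - t) * p + t * q) for t in ts}
    return len(patterns) / np.linalg.norm(q - p)

p, q = np.array([-2.0, -2.0]), np.array([2.0, 2.0])
print(local_region_density(p, q))
```

Sampling along segments drawn from the data distribution (rather than one fixed segment) would bring this estimator closer to the distribution-dependent notion the paper studies.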
Problem

Research questions and friction points this paper is trying to address.

Deep ReLU Networks
Feature Learning
Local Complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local Complexity
ReLU Networks
Feature Learning
Niket Patel
New York University
Deep Learning Theory · Information Theory · Self-Supervised Learning · LLMs · Diffusion Models
Guido Montúfar
Department of Mathematics and Department of Statistics & Data Science, University of California, Los Angeles, Los Angeles, CA 90095, USA; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany