Position: A Theory of Deep Learning Must Include Compositional Sparsity

📅 2025-07-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the fundamental question of why deep neural networks (DNNs) learn efficiently on high-dimensional tasks. We propose *compositional sparsity*, the property that target functions can be represented as compositions of a small number of low-dimensional constituent functions, as the key mechanism underlying their success. Going beyond conventional sparsity assumptions, we argue that all efficiently Turing-computable functions inherently possess this structure, and that DNNs implicitly exploit it to attain better approximation and generalization than shallow architectures. Integrating tools from approximation theory, computational complexity, and the analysis of learning dynamics, we discuss how overparameterized networks can leverage compositional sparsity for efficient optimization and low sample-complexity learning. The result is a unifying framework for deep learning theory that advances the foundational understanding of generalization, optimization dynamics, and the theoretical principles of artificial intelligence.
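To make the notion concrete, here is a rough formalization of compositional sparsity as described above; the notation (constituent functions, arity bound d) is ours, not necessarily the paper's:

```latex
% Sketch (our notation): a function f on n inputs is compositionally sparse if
% it can be written as a directed acyclic graph of constituent functions, each
% depending on at most d << n variables. Example with n = 8 and d = 2
% (a binary-tree composition):
\[
  f(x_1,\dots,x_8) =
    g\Bigl(\, h_1\bigl(h_{11}(x_1,x_2),\, h_{12}(x_3,x_4)\bigr),\;
              h_2\bigl(h_{21}(x_5,x_6),\, h_{22}(x_7,x_8)\bigr) \Bigr)
\]
% Each constituent is 2-dimensional, so approximation and sample-complexity
% bounds can scale with d = 2 rather than with the ambient dimension n = 8.
```

A deep network whose layers mirror this graph only ever has to learn low-dimensional constituents, which is the intuition behind the claimed escape from the curse of dimensionality.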

📝 Abstract
Overparametrized Deep Neural Networks (DNNs) have demonstrated remarkable success in a wide variety of domains too high-dimensional for classical shallow networks, which are subject to the curse of dimensionality. However, open questions about the fundamental principles that govern the learning dynamics of DNNs remain. In this position paper we argue that it is the ability of DNNs to exploit the compositionally sparse structure of the target function that drives their success. As such, DNNs can leverage the property that most practically relevant functions can be composed from a small set of constituent functions, each of which relies only on a low-dimensional subset of all inputs. We show that this property is shared by all efficiently Turing-computable functions and is therefore highly likely to be present in all current learning problems. While some promising theoretical insights on questions concerning approximation and generalization exist in the setting of compositionally sparse functions, several important questions on the learnability and optimization of DNNs remain open. Completing the picture of the role of compositional sparsity in deep learning is essential to a comprehensive theory of artificial, and even general, intelligence.
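For intuition, a minimal runnable sketch (our own illustration, not code from the paper) of such a compositionally sparse target on eight inputs, where every constituent function sees only two variables:

```python
# Illustrative sketch (not from the paper): a compositionally sparse target on
# n = 8 inputs. Every constituent function depends on at most d = 2 variables,
# so the target is a binary-tree composition of low-dimensional pieces.
import numpy as np

def h(a, b):
    # A generic 2-ary constituent function; the specific choice is arbitrary.
    return np.tanh(a + 2.0 * b)

def compositional_target(x):
    """Evaluate f(x_1, ..., x_8) as a depth-3 binary tree of 2-ary constituents."""
    assert x.shape[-1] == 8
    # Layer 1: four constituents, each seeing only 2 of the 8 inputs.
    u = [h(x[..., 2 * i], x[..., 2 * i + 1]) for i in range(4)]
    # Layer 2: two constituents over the layer-1 outputs.
    v = [h(u[0], u[1]), h(u[2], u[3])]
    # Layer 3: root constituent produces the scalar output.
    return h(v[0], v[1])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 8))
    print(compositional_target(x))  # shape (5,): one scalar output per sample
```

A deep network whose architecture mirrors this tree can approximate each 2-ary constituent with a small subnetwork, whereas a shallow network must treat the full 8-dimensional input at once; this is the contrast between deep and shallow architectures that the abstract refers to.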
Problem

Research questions and friction points this paper is trying to address.

Understanding how DNNs exploit compositional sparsity in functions
Exploring learnability and optimization gaps in sparse DNNs
Establishing compositional sparsity's role in general intelligence theory
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exploiting compositional sparsity in DNNs
Leveraging low-dimensional input subsets
Targeting Turing-computable function properties
👥 Authors
David A. Danhofer
ETH Zurich, Zurich, Switzerland
Davide D'Ascenzo
PhD Student, Università degli Studi di Milano
geometric deep learning, theoretical deep learning, graph neural networks, dynamical systems
Rafael Dubach
University of Zurich, Zurich, Switzerland
Tomaso Poggio
Center for Brains, Minds and Machines (CBMM), MIT, Cambridge, MA, USA