🤖 AI Summary
This work addresses the gap between how universal approximation has been established for deep networks and for shallow networks. It introduces joint-group-equivariant feature maps, a notion that generalizes classical group equivariance and covers fully-connected networks as well as group-convolutional networks. Using group representation theory, the authors prove a constructive universal approximation theorem in which the parameter distribution is given in closed form by the ridgelet transform. The theorem is applied to four model classes: depth-$n$ joint-equivariant machines, depth-$n$ fully-connected networks, depth-$n$ group-convolutional networks, and a new depth-$2$ network with quadratic forms whose universality was previously unknown. The result places the universal approximation analysis of deep and shallow networks within a single framework.
📝 Abstract
We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps, called joint-equivariant machines, based on group representation theory. "Constructive" here indicates that the distribution of parameters is given in a closed-form expression known as the ridgelet transform. Joint-group-equivariance encompasses a broad class of feature maps that generalize classical group-equivariance. In particular, fully-connected networks are not group-equivariant but are joint-group-equivariant. Our main theorem also unifies the universal approximation theorems for shallow and deep networks. Until this study, the universality of deep networks had been shown in a different manner from that of shallow networks, whereas our results treat both on common ground, so the approximation schemes of various learning machines can be understood in a unified manner. As applications, we show the constructive universal approximation properties of four examples: the depth-$n$ joint-equivariant machine, the depth-$n$ fully-connected network, the depth-$n$ group-convolutional network, and a new depth-$2$ network with quadratic forms whose universality has not been known.
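The claim that fully-connected networks are joint-group-equivariant without being group-equivariant can be illustrated numerically. The sketch below is not taken from the paper; it rests on several assumptions. The feature map is taken to be a single neuron $\phi(x, (a, b)) = \tanh(a \cdot x - b)$, the group is the affine group acting on inputs by $x \mapsto Lx + t$, the parameters are acted on by the assumed dual action $(a, b) \mapsto (L^{-\top} a,\ b + a \cdot L^{-1} t)$, and the output carries the trivial action, so joint equivariance reduces to joint invariance. All function names are hypothetical.

```python
import numpy as np

def feature(x, a, b):
    """Single fully-connected neuron: phi(x, (a, b)) = tanh(a . x - b)."""
    return np.tanh(a @ x - b)

def act_on_input(L, t, x):
    """Affine group action on the input: x -> L x + t."""
    return L @ x + t

def act_on_params(L, t, a, b):
    """Assumed dual action on parameters: (a, b) -> (L^{-T} a, b + a . L^{-1} t)."""
    L_inv = np.linalg.inv(L)
    return L_inv.T @ a, b + a @ (L_inv @ t)

def check_joint_invariance(seed=0, dim=3):
    """Return phi(x, theta), phi(g.x, g.theta), and phi(g.x, theta) for a random g."""
    rng = np.random.default_rng(seed)
    x, a, t = rng.normal(size=(3, dim))
    b = rng.normal()
    # Random matrix shifted by a multiple of the identity (illustrative choice
    # that keeps L comfortably invertible for these seeds).
    L = rng.normal(size=(dim, dim)) + dim * np.eye(dim)

    gx = act_on_input(L, t, x)
    ga, gb = act_on_params(L, t, a, b)

    base = feature(x, a, b)          # phi(x, theta)
    joint = feature(gx, ga, gb)      # phi(g.x, g.theta): acts on data and parameters
    input_only = feature(gx, a, b)   # phi(g.x, theta): acts on data only
    return base, joint, input_only

if __name__ == "__main__":
    base, joint, input_only = check_joint_invariance()
    print("phi(x, theta)      =", base)
    print("phi(g.x, g.theta)  =", joint)
    print("phi(g.x, theta)    =", input_only)
```

Under these assumptions the first two printed values agree up to floating-point error, because $(L^{-\top} a) \cdot (Lx + t) - (b + a \cdot L^{-1} t) = a \cdot x - b$. The third value generally differs, which reflects the sense in which the neuron is not invariant when the group acts on the input alone.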