🤖 AI Summary
This work investigates the approximation capability of shallow neural networks that use the flow of a neural ordinary differential equation (neural ODE) as the activation function. Motivated by stability requirements in practical deployment, we establish, for the first time, a rigorous approximation theory under two simultaneous structural constraints: a bound on the Lipschitz constant of the flow map and unit norm of the linear-layer weight matrices. We prove that the universal approximation property (UAP) in the space of continuous functions holds for the unconstrained network and is preserved when only one of the two constraints is imposed. When both constraints are imposed together, we derive quantitative approximation error bounds. Our key contribution lies in uncovering the fundamental trade-off between the architectural constraints (flow regularity and weight normalization) and expressive power. This analysis provides a theoretical foundation for designing lightweight, verifiable neural networks driven by neural ODEs, connecting stability guarantees with approximation accuracy.
📝 Abstract
We study the approximation properties of shallow neural networks whose activation function is defined as the flow of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters are required to satisfy two constraints. First, we constrain the Lipschitz constant of the flow of the neural ODE to increase the stability of the shallow neural network. Second, we restrict the norm of the weight matrices of the linear layers to one, so that the restricted expansivity of the flow is not compensated by increased expansivity of the linear layers. For this setting, we prove approximation bounds that quantify the accuracy with which a continuous function can be approximated by a shallow neural network satisfying both constraints. We also prove that the UAP holds if only one of the two constraints is imposed, that is, either the constraint on the Lipschitz constant of the flow or the unit-norm constraint on the weight matrices of the linear layers.
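To make the architecture described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a vector field of the form tanh(Wx + b), an explicit Euler discretization of the flow, and the spectral norm as the matrix norm; none of these choices is stated in the abstract. The helper names `unit_spectral_norm`, `ode_flow`, and `shallow_net` are hypothetical. Under these assumptions, a standard Grönwall-type estimate bounds the Lipschitz constant of the flow by exp(||W||_2 T), and with unit-norm linear layers the same bound carries over to the whole network. The paper may control the Lipschitz constant of the flow in a different way.

```python
import numpy as np

def unit_spectral_norm(A):
    """Rescale A so that its spectral norm (largest singular value) equals 1."""
    return A / np.linalg.norm(A, 2)

def ode_flow(x, W, b, T=1.0, steps=100):
    """Explicit Euler approximation of the time-T flow of x'(t) = tanh(W x(t) + b).

    The vector field tanh(W x + b) is an assumed example; it is Lipschitz
    with constant at most ||W||_2 because tanh is 1-Lipschitz.
    """
    h = T / steps
    for _ in range(steps):
        x = x + h * np.tanh(W @ x + b)
    return x

def shallow_net(x, A1, b1, W, b, A2, b2, T=1.0, steps=100):
    """Shallow network: affine layer, ODE-flow activation, affine layer."""
    return A2 @ ode_flow(A1 @ x + b1, W, b, T, steps) + b2

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 2, 8, 1

# Linear layers constrained to unit spectral norm.
A1 = unit_spectral_norm(rng.standard_normal((d_hidden, d_in)))
A2 = unit_spectral_norm(rng.standard_normal((d_out, d_hidden)))
b1 = rng.standard_normal(d_hidden)
b2 = rng.standard_normal(d_out)

# Parameters of the (assumed) neural ODE vector field.
W = rng.standard_normal((d_hidden, d_hidden))
b = rng.standard_normal(d_hidden)
T = 1.0

# Groenwall-type bound on the Lipschitz constant of the flow, and hence of
# the whole network since both linear layers have norm 1.
lipschitz_bound = np.exp(np.linalg.norm(W, 2) * T)

# Empirical check on one random pair of inputs.
x, y = rng.standard_normal(d_in), rng.standard_normal(d_in)
fx = shallow_net(x, A1, b1, W, b, A2, b2, T)
fy = shallow_net(y, A1, b1, W, b, A2, b2, T)
ratio = np.linalg.norm(fx - fy) / np.linalg.norm(x - y)
print(f"observed ratio {ratio:.4f}, bound {lipschitz_bound:.4f}")
```

In this sketch, shrinking ||W||_2 or T tightens the bound on the flow, while the unit-norm rescaling prevents the linear layers from reintroducing the expansivity that the flow constraint removes, which is the interplay the abstract describes.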