Shallow ReLU neural networks and finite elements

📅 2024-03-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the weak representation capability of shallow ReLU neural networks for piecewise linear functions—both continuous and discontinuous—defined on convex polyhedral meshes, and their approximation properties in $L^p$ norms. We propose a compact two-hidden-layer ReLU network construction, establishing for the first time an exact quantitative relationship between the number of neurons and the number of polyhedra and hyperplanes in the mesh. We rigorously bridge ReLU networks with classical finite element methods—including continuous/discontinuous linear elements and tensor-product elements—proving their functional representation equivalence. Leveraging this correspondence, we transfer the $L^p$ approximation analysis framework from finite elements to neural networks, thereby enabling precise quantification of structural requirements (e.g., width, depth, neuron count) for achieving prescribed approximation accuracy. This is the first work to establish a rigorous theoretical connection between shallow ReLU networks and finite element theory, offering a novel, structure-aware perspective on the approximation mechanism of neural networks.
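The paper's two-hidden-layer construction is not reproduced here, but the mechanism it relies on can be sketched by hand (all names below are our own, not the paper's). A convex continuous piecewise linear function is a maximum of affine pieces, and a pairwise maximum is exactly expressible with ReLUs via $\max(u,v) = \mathrm{ReLU}(u-v) + \mathrm{ReLU}(v) - \mathrm{ReLU}(-v)$; nesting two such maxima gives an exact two-hidden-layer network for, e.g., $f(x) = \max(0,\, x,\, 2x-1)$:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def f_network(x):
    """Exact two-hidden-layer ReLU evaluation of f(x) = max(0, x, 2x-1)."""
    # Hidden layer 1: three ReLU neurons.
    h1 = relu(x)            # max(0, x)
    h2 = relu(2 * x - 1)    # positive part of 2x - 1
    h3 = relu(1 - 2 * x)    # negative part of 2x - 1  (so h2 - h3 = 2x - 1)
    # Hidden layer 2: take max(h1, 2x-1) using
    # max(u, v) = ReLU(u - v) + ReLU(v) - ReLU(-v).
    g1 = relu(h1 - h2 + h3)  # ReLU(h1 - (2x-1))
    g2 = relu(h2 - h3)       # ReLU(2x-1)
    g3 = relu(h3 - h2)       # ReLU(-(2x-1))
    # Affine output layer.
    return g1 + g2 - g3

def f_exact(x):
    return np.maximum.reduce([np.zeros_like(x), x, 2 * x - 1])

x = np.linspace(-2.0, 2.0, 401)
assert np.allclose(f_network(x), f_exact(x))
```

This toy case uses 3 + 3 hidden neurons; the paper's contribution is to count such neurons exactly in terms of the polytopes and hyperplanes of a general mesh.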

📝 Abstract
We point out that (continuous or discontinuous) piecewise linear functions on a convex polytope mesh can be represented by two-hidden-layer ReLU neural networks in a weak sense. In addition, the numbers of neurons in the two hidden layers required for such weak representation are given exactly in terms of the numbers of polytopes and hyperplanes in the mesh. The results hold in particular for constant and linear finite element functions. This weak representation establishes a bridge between shallow ReLU neural networks and finite element functions, and yields a perspective for analyzing the approximation capability of ReLU neural networks in the $L^p$ norm via finite element functions. Moreover, we discuss the strict representation of tensor finite element functions via the recent tensor neural networks.
Problem

Research questions and friction points this paper is trying to address.

Represent piecewise linear functions via shallow ReLU networks
Bridge ReLU networks and finite element functions
Analyze ReLU network approximation in the $L^p$ norm
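One concrete 1D instance of the ReLU-to-finite-element bridge (a standard textbook fact, not a construction taken from this paper): the continuous linear (P1) finite element "hat" basis function is itself an exact ReLU network, since any 1D piecewise linear function is an affine combination of shifted ReLUs:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def hat(x, a, b, c):
    """P1 finite element hat basis: 0 at a and c, 1 at b, linear between.

    Written as an exact one-hidden-layer ReLU network with 3 neurons:
    the coefficients encode the slope changes at the nodes a, b, c.
    """
    s1 = 1.0 / (b - a)    # slope on [a, b]
    s2 = -1.0 / (c - b)   # slope on [b, c]
    return s1 * relu(x - a) + (s2 - s1) * relu(x - b) - s2 * relu(x - c)

x = np.linspace(-1.0, 3.0, 401)
# Peaks at 1 at the middle node, vanishes outside [0, 2].
assert abs(hat(1.0, 0.0, 1.0, 2.0) - 1.0) < 1e-12
assert np.all(np.abs(hat(np.array([-0.5, 2.5]), 0.0, 1.0, 2.0)) < 1e-12)
```

In higher dimensions no such exact one-layer formula exists on general meshes, which is precisely why the paper's weak two-hidden-layer representation on polytope meshes is needed.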
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-hidden-layer ReLU networks represent piecewise linear functions
Neuron counts based on polytopes and hyperplanes
Tensor networks for strict finite element representation
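The kind of $L^p$ analysis that transfers from finite elements to networks can be illustrated with a generic numerical check (our own sketch, not an experiment from the paper): the $L^2$ error of the continuous piecewise linear (P1) interpolant on a uniform mesh decays as $O(h^2)$, so halving the mesh size cuts the error roughly fourfold.

```python
import numpy as np

def l2_error_p1_interp(f, n, a=0.0, b=np.pi):
    """L2 error of the P1 interpolant of f on a uniform mesh with n elements,
    estimated by trapezoidal quadrature on a 20x finer aligned grid."""
    nodes = np.linspace(a, b, n + 1)
    xq = np.linspace(a, b, 20 * n + 1)       # fine quadrature grid
    interp = np.interp(xq, nodes, f(nodes))  # P1 interpolant values
    g = (f(xq) - interp) ** 2
    dx = xq[1] - xq[0]
    err2 = dx * (0.5 * g[0] + g[1:-1].sum() + 0.5 * g[-1])
    return np.sqrt(err2)

e1 = l2_error_p1_interp(np.sin, 16)
e2 = l2_error_p1_interp(np.sin, 32)
print(e1 / e2)  # ratio near 4, consistent with O(h^2) convergence
```

Via the paper's representation results, such element-count error bounds translate into neuron-count requirements for a prescribed accuracy.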