Universality of physical neural networks with multivariate nonlinearity

📅 2025-09-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
The universal approximation capability of physical neural networks (PNNs) has long lacked rigorous theoretical validation, hindering their deployment in energy-efficient deep-learning hardware. Method: We establish the first universality theorem for PNNs, formally proving their ability to approximate any continuous function to arbitrary precision, provided inputs are suitably encoded in the system's tunable physical parameters. Leveraging this foundation, we propose a provably universal, scalable free-space photonic architecture, and show that combining the theorem with time-division multiplexing offers a route to large effective system sizes in practical but poorly scalable on-chip photonic devices. Results: Experimental evaluation on image classification tasks demonstrates high accuracy. This work not only fills a fundamental theoretical gap in PNN research but also introduces a design paradigm for low-power, high-throughput physical deep-learning hardware grounded in rigorous approximation theory.

📝 Abstract
The enormous energy demand of artificial intelligence is driving the development of alternative hardware for deep learning. Physical neural networks try to exploit physical systems to perform machine learning more efficiently. In particular, optical systems can calculate with light using negligible energy. While their computational capabilities were long limited by the linearity of optical materials, nonlinear computations have recently been demonstrated through modified input encoding. Despite this breakthrough, our inability to determine if physical neural networks can learn arbitrary relationships between data -- a key requirement for deep learning known as universality -- hinders further progress. Here we present a fundamental theorem that establishes a universality condition for physical neural networks. It provides a powerful mathematical criterion that imposes device constraints, detailing how inputs should be encoded in the tunable parameters of the physical system. Based on this result, we propose a scalable architecture using free-space optics that is provably universal and achieves high accuracy on image classification tasks. Further, by combining the theorem with temporal multiplexing, we present a route to potentially huge effective system sizes in highly practical but poorly scalable on-chip photonic devices. Our theorem and scaling methods apply beyond optical systems and inform the design of a wide class of universal, energy-efficient physical neural networks, justifying further efforts in their development.
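The abstract's central mechanism — obtaining nonlinear computation from a linear optical system by encoding the input into the system's tunable parameters — can be pictured with a toy model. The sketch below is our illustration, not the paper's actual optical setup: each layer is linear in the propagating field, but its transfer matrix is modulated by the (re-encoded) input, so the end-to-end map becomes polynomial, hence nonlinear, in the input.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x_scalar, W):
    # Hypothetical encoding: the layer's transfer matrix is modulated by
    # the input (e.g., a transmission setting proportional to x). The
    # layer stays linear in the optical field itself.
    return W * (1.0 + x_scalar)

def physical_network(x_scalar, weights, field_in):
    # Cascade of linear layers; the input x re-enters at every layer.
    field = field_in
    for W in weights:
        field = layer(x_scalar, W) @ field
    return field

weights = [rng.standard_normal((3, 3)) for _ in range(2)]
field_in = rng.standard_normal(3)

# With two layers the output scales as (1 + x)^2: quadratic in x.
# A linear (affine) map in x would satisfy y2 - y1 == y1 - y0.
y0 = physical_network(0.0, weights, field_in)
y1 = physical_network(1.0, weights, field_in)
y2 = physical_network(2.0, weights, field_in)
print(np.allclose(y2 - y1, y1 - y0))  # False: the response is nonlinear in x
```

Superposition fails even though every physical step is linear in the field — this is the sense in which modified input encoding unlocks nonlinear computation.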
Problem

Research questions and friction points this paper is trying to address.

Determining if physical neural networks can learn arbitrary data relationships
Establishing a universality condition for physical neural networks
Overcoming computational limitations of linear optical materials
Innovation

Methods, ideas, or system contributions that make the work stand out.

Universality theorem for physical neural networks
Scalable free-space optics architecture for universality
Temporal multiplexing for large effective system sizes
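The temporal-multiplexing idea can be sketched in a few lines (a toy model under our own assumptions, not the paper's device): a small n-mode device is reprogrammed and reused over T time slots, so that a 4-mode chip acts as one block of a 4·T-dimensional logical layer.

```python
import numpy as np

rng = np.random.default_rng(1)
n_modes, n_slots = 4, 8  # small physical device, many time slots

# Large logical input, split into per-slot chunks.
x = rng.standard_normal(n_modes * n_slots)
chunks = x.reshape(n_slots, n_modes)

# Hypothetical per-slot reprogramming of the same small device.
W_slots = rng.standard_normal((n_slots, n_modes, n_modes))

# Feed each chunk through the device sequentially and concatenate,
# emulating one large block-diagonal linear layer.
y = np.concatenate([W @ c for W, c in zip(W_slots, chunks)])
print(y.shape)  # (32,): effective dimension is n_modes * n_slots
```

The effective system size grows linearly with the number of time slots at the cost of latency, which is why the abstract highlights this route for highly practical but poorly scalable on-chip devices.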
Benjamin Savinson
Optical Materials Engineering Laboratory, Department of Mechanical and Process Engineering, ETH Zurich, 8092 Zurich, Switzerland; ETH AI Center, ETH Zurich, 8092 Zurich, Switzerland; Seminar for Applied Mathematics, Department of Mathematics, ETH Zurich, 8092 Zurich, Switzerland
David J. Norris
Optical Materials Engineering Laboratory, Department of Mechanical and Process Engineering, ETH Zurich, 8092 Zurich, Switzerland; ETH AI Center, ETH Zurich, 8092 Zurich, Switzerland
Siddhartha Mishra
Professor of Applied Mathematics, ETH Zurich, Switzerland
Applied Mathematics; Numerical Analysis; Scientific Computing; Computational Fluid and Plasma Dynamics; Applied PDEs
Samuel Lanthaler
University of Vienna
fluid dynamics; numerical analysis; partial differential equations; Bayesian data assimilation; deep