🤖 AI Summary
This work addresses the high computational cost of deep learning by studying stochastic physical neural networks (PNNs), in which learning and inference are performed directly by physical processes. The authors propose two new stochastic neurons: an electronic one based on single-electron tunneling through a quantum dot, and a photonic one in which a single-photon source drives one of two modes coupled by a controllable beam-splitter-like interaction. The neuron state is encoded in the charge state of the quantum dot and in the occupation of the undriven mode, respectively. Single-hidden-layer stochastic PNNs are trained on MNIST handwritten digit classification while varying the number of trials per layer and using either true probabilities or empirical outputs in the backward pass. With empirical outputs in the backward pass, the network exceeds 97% test accuracy with few trials per layer, and accuracy remains high under a high degree of noise and model uncertainty.
📝 Abstract
The computational demands of deep learning motivate the investigation of alternative approaches to computation. One alternative is physical neural networks (PNNs), in which learning and inference are performed directly via physical processes. Stochastic PNNs arise when the underlying neurons are realized by the dynamics of a stochastic activation switch. Here we propose novel electronic and photonic stochastic neurons. The electronic realization is implemented by single-electron tunneling through a quantum dot. The photonic realization is implemented via a single-photon source driving one of two modes coupled via a controllable beam-splitter-like interaction. In the electronic case, the charge state of the quantum dot forms the basis for the stochastic neuron, whereas in the photonic case the occupation of the undriven mode serves as the basis for the stochastic neuron. Training of stochastic PNNs is performed with models of these stochastic neurons, as well as with the previously introduced coherently driven, single-photon-detector stochastic neurons. Several training strategies for MNIST handwritten digit classification have been investigated using single-hidden-layer stochastic PNNs, including varying the number of trials in each layer to control forward-pass stochasticity and employing either true probabilities or empirical outputs in the backward pass to evaluate their influence on gradient estimation. We show that when empirical outputs are used in the backward pass, the network achieves more than 97% test accuracy with few trials per layer. Despite the simplicity of the model architecture, high test accuracy is maintained in the presence of a high degree of noise and model uncertainty. The results demonstrate the potential of embracing stochastic PNNs for deep learning.
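The two training knobs named in the abstract (the number of trials per layer, and true probability versus empirical output in the backward pass) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the sigmoid firing probability, the helper names `stochastic_layer` and `backward_activation`, the layer sizes, and the random stand-in inputs. It does not model the quantum-dot or photonic device physics.

```python
import numpy as np

def sigmoid(z):
    # Assumed firing-probability curve; the real device response may differ.
    return 1.0 / (1.0 + np.exp(-z))

def stochastic_layer(x, W, b, n_trials, rng):
    """Forward pass of one layer of binary stochastic neurons.

    Returns (p, a_hat): the true firing probability of each neuron and
    the empirical output, i.e. the fraction of n_trials in which it fired.
    """
    p = sigmoid(x @ W + b)
    counts = rng.binomial(n_trials, p)   # number of "on" outcomes per neuron
    a_hat = counts / n_trials            # empirical estimate of p
    return p, a_hat

def backward_activation(p, a_hat, use_empirical):
    # Hypothetical switch: which activation value the backward pass sees.
    return a_hat if use_empirical else p

rng = np.random.default_rng(0)
x = rng.random((4, 784))                 # 4 random stand-in inputs, not MNIST
W = rng.normal(0.0, 0.05, (784, 16))     # 16 hidden neurons (arbitrary size)
b = np.zeros(16)

# Few trials give a noisy forward pass; many trials approach the true probability.
p, a_few = stochastic_layer(x, W, b, n_trials=3, rng=rng)
_, a_many = stochastic_layer(x, W, b, n_trials=100_000, rng=rng)

# Local slope a * (1 - a) of the assumed sigmoid, evaluated both ways.
a_true = backward_activation(p, a_few, use_empirical=False)
a_emp = backward_activation(p, a_few, use_empirical=True)
slope_true = a_true * (1.0 - a_true)
slope_emp = a_emp * (1.0 - a_emp)
```

With `n_trials=3` the empirical output can only take the values 0, 1/3, 2/3, and 1, so `slope_emp` is a coarse estimate of `slope_true`; raising `n_trials` reduces forward-pass stochasticity at the cost of more physical samples.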