🤖 AI Summary
To address the low energy efficiency, poor scalability, and insufficient real-time performance of DNN deployment on edge devices, this paper proposes FAST-ONN, a free-space optical in-memory computing architecture. It uses a VCSEL array for high-speed optical input modulation and a high-pixel-count spatial light modulator for parallel in-memory weighted summation, enabling differential readout of signed weights, in-system backpropagation training, and photonic reprogrammability. The system achieves a clock rate beyond 1 GHz, convolutional throughput of 100 million frames per second (for YOLO feature extraction), and microsecond-scale inference latency. Its energy efficiency exceeds that of state-of-the-art electronic ASICs by over two orders of magnitude. Because it is built on a three-dimensional free-space optical design, FAST-ONN scales in both device count and channel parallelism, pointing toward edge hardware that combines high throughput, ultra-low power consumption, and hardware-native trainability.
📝 Abstract
The ability to process and act on data in real time is increasingly critical for applications ranging from autonomous vehicles and three-dimensional environmental sensing to remote robotics. However, the deployment of deep neural networks (DNNs) in edge devices is hindered by the lack of energy-efficient, scalable computing hardware. Here, we introduce a fanout spatial time-of-flight optical neural network (FAST-ONN) that calculates billions of convolutions per second with ultralow latency and power consumption. This is enabled by combining high-speed, dense arrays of vertical-cavity surface-emitting lasers (VCSELs) for input modulation with high-pixel-count spatial light modulators for in-memory weighting. In a three-dimensional optical system, parallel differential readout allows signed weight values and accurate inference in a single shot. The performance is benchmarked with feature extraction in You-Only-Look-Once (YOLO), performing convolution at 100 million frames per second (MFPS), and with in-system backward propagation training using photonic reprogrammability. The VCSEL transmitters can be implemented in any free-space optical computing system to raise the clock rate above a gigahertz. The high scalability in device count and channel parallelism opens a new avenue for scaling up free-space computing hardware.
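The differential readout mentioned above can be illustrated numerically. Optical intensities and modulator transmissions are non-negative, so one common way to represent a signed weight is to split it into two non-negative parts, read each with its own detector, and subtract the two signals. The sketch below assumes that scheme; the abstract does not give the exact encoding used in FAST-ONN, and the function names (`split_signed`, `differential_dot`, `differential_conv1d`) are hypothetical.

```python
# Minimal numerical sketch of differential readout for signed weights.
# Assumption: each signed weight w is encoded as two non-negative
# transmissions (w_pos, w_neg) with w = w_pos - w_neg. This is a common
# scheme, not necessarily the one used in FAST-ONN.

def split_signed(weights):
    """Split signed weights into two non-negative masks."""
    w_pos = [max(w, 0.0) for w in weights]
    w_neg = [max(-w, 0.0) for w in weights]
    return w_pos, w_neg


def differential_dot(inputs, weights):
    """Weighted sum of non-negative inputs using two 'detector' channels."""
    w_pos, w_neg = split_signed(weights)
    # Each detector sums only non-negative products, as an intensity sum would.
    i_pos = sum(x * w for x, w in zip(inputs, w_pos))
    i_neg = sum(x * w for x, w in zip(inputs, w_neg))
    # Subtracting the two channels recovers the signed result.
    return i_pos - i_neg


def differential_conv1d(signal, kernel):
    """Valid-mode sliding-window correlation built from differential dots."""
    k = len(kernel)
    return [differential_dot(signal[i:i + k], kernel)
            for i in range(len(signal) - k + 1)]


signal = [0.0, 1.0, 2.0, 3.0, 4.0]   # non-negative input intensities
kernel = [1.0, 0.0, -1.0]            # signed edge-detection kernel
print(differential_conv1d(signal, kernel))  # [-2.0, -2.0, -2.0]
```

In hardware the two sums would be formed in parallel in a single shot; here they are computed sequentially only for clarity.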