AI Summary
Optical neural networks (ONNs) suffer significant performance degradation after digital training due to physical noise and fabrication imperfections. To address this model-hardware mismatch, we propose Gradient-Informed Fine-Tuning (GIFT), a lightweight, hardware-aware fine-tuning method, deployable in situ, that leverages gradient information characterizing the noise structure to optimize parameters without retraining. GIFT bridges digital pretraining with gradient-guided in-situ adaptation and operates on feedforward ONN architectures. We theoretically derive conditions under which GIFT guarantees performance improvement. Evaluated on MNIST classification, GIFT achieves up to 28% relative accuracy gain over baseline models on a five-layer ONN. The method effectively mitigates hardware-induced performance loss while imposing minimal computational and calibration overhead, offering an efficient, low-cost calibration paradigm toward practical ONN deployment.
Abstract
Optical Neural Networks (ONNs) promise significant advantages over traditional electronic neural networks, including ultrafast computation, high bandwidth, and low energy consumption, by leveraging the intrinsic capabilities of photonics. However, training ONNs poses unique challenges, notably the reliance on simplified in silico models whose trained parameters must subsequently be mapped to physical hardware. This process often introduces inaccuracies due to discrepancies between the idealized digital model and the physical ONN implementation, particularly stemming from noise and fabrication imperfections.
In this paper, we analyze how noise misspecification during in silico training impacts ONN performance, and we introduce Gradient-Informed Fine-Tuning (GIFT), a lightweight algorithm designed to mitigate this performance degradation. GIFT uses gradient information derived from the noise structure of the ONN to adapt pretrained parameters directly in situ, without requiring expensive retraining or complex experimental setups. We further derive formal conditions under which GIFT is guaranteed to improve ONN performance.
We also demonstrate the effectiveness of GIFT via simulation on a five-layer feedforward ONN trained on the MNIST digit classification task. GIFT achieves up to $28\%$ relative accuracy improvement over the baseline performance under noise misspecification, without resorting to costly retraining. Overall, GIFT provides a practical solution for bridging the gap between simplified digital models and real-world ONN implementations.
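To make the idea concrete, the sketch below illustrates the general principle behind in-situ, gradient-guided adaptation on a toy problem: a pretrained linear "layer" is deployed on simulated hardware that perturbs its weights with multiplicative noise, and the programmed weights are then nudged by a gradient estimate formed from the noisy measured outputs. This is a hypothetical, simplified illustration (the noise model, layer, and loss are invented for the example), not the paper's actual GIFT algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "hardware" layer: the deployed weights pick up multiplicative noise,
# standing in for fabrication imperfections (hypothetical noise model).
def hardware_forward(W, X, noise_std=0.05):
    W_hw = W * (1.0 + noise_std * rng.standard_normal(W.shape))
    return W_hw @ X

# Pretrained (in silico) weights and a toy regression target.
W = 0.1 * rng.standard_normal((4, 8))
W_true = 0.1 * rng.standard_normal((4, 8))
X = rng.standard_normal((8, 64))
Y = W_true @ X

def mse(W):
    return float(np.mean((hardware_forward(W, X) - Y) ** 2))

loss_before = mse(W)

# In-situ fine-tuning loop: measure noisy hardware outputs, form a gradient
# estimate of the expected squared error, and update the programmed weights
# directly, with no full retraining of the digital model.
lr = 0.05
for _ in range(200):
    err = hardware_forward(W, X) - Y      # residual measured on "hardware"
    grad = err @ X.T / X.shape[1]         # gradient estimate w.r.t. W
    W -= lr * grad

loss_after = mse(W)
```

Because the multiplicative noise here is zero-mean, the noisy measurements still yield an unbiased estimate of the expected-loss gradient, so the loop drives the hardware loss down toward the noise floor rather than to zero.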