🤖 AI Summary
This work addresses the challenge of effective image restoration under severe noise in optical neural networks, which has been hindered by limitations in existing training methodologies. The study proposes the first framework to integrate transfer learning into all-optical computing: a free-space diffractive optical neural network is first pre-trained on a large-scale dataset of 3.45 million generic images and subsequently fine-tuned for specific tasks. This approach substantially enhances denoising performance at extremely low input PSNR levels (<8 dB), achieving output PSNR values exceeding 18 dB—outperforming both Fourier filtering and optically trained networks without pre-training. Moreover, the model demonstrates strong generalization across diverse datasets, including MNIST, ChestMNIST, CIFAR-10, and CelebA, successfully enabling downstream vision applications such as face detection, license plate recognition, and drone localization.
📝 Abstract
Optical neural networks are emerging as powerful machine learning and information processing tools because of their potential advantages in speed and energy efficiency. The training methods of these physical models, however, remain underexplored compared to their digital counterparts and are leading to suboptimal performance. This paper reports a pre-training-driven approach that leads to snapshot image denoising with substantially improved quality. We demonstrated effective free-space optical denoising by a diffractive network optimized by a two-step process including (1) pre-training using a massive dataset of 3.45 million diverse but simple images and (2) fine-tuning with the corresponding task-specific datasets. Compared to conventional Fourier-domain filtering and directly trained diffractive networks, such a transfer learning process exhibited prominent advantages for denoising images degraded by severe noise, peak signal-to-noise ratio (PSNR) below 8 dB, while preserving fine image features and improving the PSNR to above 18 dB. Importantly, the same pre-trained optical network could be consistently fine-tuned to process degraded images from highly diverse styles ranging from handwritten digits (MNIST) and chest X-rays (ChestMNIST) to CIFAR-10 images and human faces (CelebA). We further demonstrated the critical role of our optical denoisers in vision-based applications, including face detection, plate recognition, and localization of UAVs in noisy conditions.