🤖 AI Summary
Foundation models for nanophotonics are hindered by the scarcity of large, diverse training data. To address this, we propose MOCLIP (Metasurface Optics Contrastive Learning Pretrained), a foundation model for nanophotonic inverse design that uses contrastive learning to align metasurface geometries and spectral responses in a shared latent space, enabling zero-shot cross-modal inference. The method combines deep neural networks, experimentally acquired training data, and latent-space optimization, and is trained on a dataset with a sample density comparable to ImageNet-1K. MOCLIP supports high-throughput inverse design and generative optimization: zero-shot prediction runs at 200,000 samples per second, and generative latent-space optimization reaches 97% accuracy. We also introduce an optical information storage concept that uses MOCLIP to reach a density of 0.1 Gbit/mm² at the resolution limit, exceeding commercial optical media by a factor of six.
📝 Abstract
Foundation models (FMs) are transforming artificial intelligence by enabling generalizable, data-efficient solutions across a broad range of domains and applications. However, the lack of large and diverse datasets limits the development of FMs in nanophotonics. This work presents MOCLIP (Metasurface Optics Contrastive Learning Pretrained), a nanophotonic foundation model that integrates metasurface geometry and spectra within a shared latent space. MOCLIP employs contrastive learning to align geometry and spectral representations using an experimentally acquired dataset with a sample density comparable to ImageNet-1K. We demonstrate MOCLIP's inverse-design capabilities for high-throughput zero-shot prediction at a rate of 0.2 million samples per second, enabling the design of a full 4-inch wafer populated with high-density metasurfaces in minutes. We also show generative latent-space optimization reaching 97 percent accuracy. Finally, we introduce an optical information storage concept that uses MOCLIP to achieve a density of 0.1 Gbit per square millimeter at the resolution limit, exceeding commercial optical media by a factor of six. These results position MOCLIP as a scalable and versatile platform for next-generation photonic design and data-driven applications.
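To make the alignment step concrete, the following is a minimal sketch of a CLIP-style symmetric contrastive (InfoNCE) loss, followed by a zero-shot lookup in the shared latent space. The abstract does not give MOCLIP's exact loss, encoders, or hyperparameters, so everything here is an assumption for illustration: the function names (`clip_style_loss`, `zero_shot_retrieve`), the temperature of 0.07, and the 3-dimensional toy embeddings are hypothetical and do not come from the paper or its data.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(geom_emb, spec_emb, temperature):
    """Temperature-scaled cosine similarities; entry [i][j] pairs geometry i with spectrum j."""
    g = [l2_normalize(v) for v in geom_emb]
    s = [l2_normalize(v) for v in spec_emb]
    return [[sum(a * b for a, b in zip(gi, sj)) / temperature for sj in s] for gi in g]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct match for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_style_loss(geom_emb, spec_emb, temperature=0.07):
    """Symmetric InfoNCE: average of geometry-to-spectrum and spectrum-to-geometry terms.

    The temperature value is an assumed default, not a value reported for MOCLIP.
    """
    logits = similarity_matrix(geom_emb, spec_emb, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

def zero_shot_retrieve(target_spec_emb, geom_library):
    """Return the index of the library geometry most similar to a target spectrum embedding."""
    t = l2_normalize(target_spec_emb)
    scores = [sum(a * b for a, b in zip(l2_normalize(g), t)) for g in geom_library]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy embeddings (hypothetical): row i of each list describes the same metasurface.
geom = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
spec_aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
# Same spectra with the pairing broken, to show that the loss penalizes misalignment.
spec_shuffled = [spec_aligned[1], spec_aligned[2], spec_aligned[0]]

loss_aligned = clip_style_loss(geom, spec_aligned)
loss_shuffled = clip_style_loss(geom, spec_shuffled)
best = zero_shot_retrieve([0.05, 0.2, 0.95], geom)

print(f"aligned loss:  {loss_aligned:.4f}")
print(f"shuffled loss: {loss_shuffled:.4f}")
print(f"retrieved geometry index: {best}")
```

In this sketch, correctly paired geometry and spectrum embeddings give a much lower loss than mismatched ones, which is the signal that pulls the two modalities together during training. Once the spaces are aligned, inverse design reduces to a nearest-neighbor search over precomputed geometry embeddings. That search needs no per-query optimization, which is one plausible reason zero-shot prediction can run at high throughput.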