🤖 AI Summary
Deploying Vision Transformers (ViTs) on near-sensor silicon photonic accelerators is hampered by hardware-induced noise and tight energy budgets, which often cause significant accuracy degradation. This work presents the first ViT deployment framework tailored to real-world silicon photonic hardware, achieving hardware-algorithm co-optimization through several key innovations: empirical noise modeling based on measured microring-resonator arrays, an activation-dependent variance proxy model, chance-constrained training (CCT), and a noise-aware LayerNorm design. Notably, the proposed approach recovers near-ideal, noise-free accuracy on actual photonic hardware without requiring on-chip learning or additional optical components, while simultaneously adhering to stringent system energy budgets.
📝 Abstract
Deploying Vision Transformers (ViTs) on near-sensor analog accelerators demands training pipelines that are explicitly aligned with device-level noise and energy constraints. We introduce a compact framework for silicon-photonic execution of ViTs that integrates measured hardware noise, robust attention training, and an energy-aware processing flow. We first characterize bank-level noise in microring-resonator (MR) arrays, including fabrication variation, thermal drift, and amplitude noise, and convert these measurements into closed-form, activation-dependent variance proxies for attention logits and feed-forward activations. Using these proxies, we develop Chance-Constrained Training (CCT), which enforces variance-normalized logit margins to bound attention rank flips, and a noise-aware LayerNorm that stabilizes feature statistics without changing the optical schedule. These components yield a practical "measure → model → train → run" pipeline that optimizes accuracy under noise while respecting system energy limits. Hardware-in-the-loop experiments with MR photonic banks show that our approach restores near-clean accuracy under realistic noise budgets, with no in-situ learning or additional optical MACs.
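To make the CCT idea concrete, the sketch below shows one plausible form of a variance-normalized margin penalty on attention logits: if the gap between the top-1 and top-2 logits, divided by the standard deviation predicted by a variance proxy, exceeds a threshold, a noise-induced rank flip is unlikely. The function name, the hinge form, and the threshold `tau` are illustrative assumptions, not the paper's exact loss.

```python
import numpy as np

def cct_margin_penalty(logits, sigma2, tau=1.0):
    """Illustrative Chance-Constrained Training (CCT) penalty (assumed form).

    Penalizes rows of attention logits whose variance-normalized top-1/top-2
    margin falls below tau, so that additive noise with per-logit variance
    given by the proxy `sigma2` is unlikely to flip the attention ranking.

    logits: (..., n) attention logits for each query row
    sigma2: (..., n) activation-dependent variance proxy per logit
    """
    order = np.argsort(logits, axis=-1)
    i1 = order[..., -1:]                     # index of top-1 logit
    i2 = order[..., -2:-1]                   # index of top-2 logit
    margin = (np.take_along_axis(logits, i1, -1)
              - np.take_along_axis(logits, i2, -1))[..., 0]
    # variance of the margin under independent noise on the two logits
    var = (np.take_along_axis(sigma2, i1, -1)
           + np.take_along_axis(sigma2, i2, -1))[..., 0]
    norm_margin = margin / (np.sqrt(var) + 1e-8)
    # hinge penalty: nonzero only when the normalized margin is below tau
    return np.maximum(tau - norm_margin, 0.0).mean()
```

In training, a penalty of this shape would be added (with a weight) to the task loss, pushing the network to keep attention decisions robust to the measured hardware noise without any extra optical operations at inference.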