Light-Bound Transformers: Hardware-Anchored Robustness for Silicon-Photonic Computer Vision Systems

📅 2026-04-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
ViTs deployed on near-sensor silicon photonic accelerators are highly susceptible to hardware-induced noise and tight energy-efficiency constraints, often suffering significant accuracy degradation. This work presents the first ViT deployment framework tailored to real-world silicon photonic hardware, achieving hardware-algorithm co-optimization through several key innovations: empirical noise modeling based on measured microring-resonator arrays, an activation-dependent variance proxy model, chance-constrained training (CCT), and a noise-aware LayerNorm design. Notably, the proposed approach recovers near-ideal, noise-free accuracy on actual photonic hardware without requiring on-chip learning or additional optical components, while simultaneously adhering to stringent system energy budgets.
📝 Abstract
Deploying Vision Transformers (ViTs) on near-sensor analog accelerators demands training pipelines that are explicitly aligned with device-level noise and energy constraints. We introduce a compact framework for silicon-photonic execution of ViTs that integrates measured hardware noise, robust attention training, and an energy-aware processing flow. We first characterize bank-level noise in microring-resonator (MR) arrays, including fabrication variation, thermal drift, and amplitude noise, and convert these measurements into closed-form, activation-dependent variance proxies for attention logits and feed-forward activations. Using these proxies, we develop Chance-Constrained Training (CCT), which enforces variance-normalized logit margins to bound attention rank flips, and a noise-aware LayerNorm that stabilizes feature statistics without changing the optical schedule. These components yield a practical "measure → model → train → run" pipeline that optimizes accuracy under noise while respecting system energy limits. Hardware-in-the-loop experiments with MR photonic banks show that our approach restores near-clean accuracy under realistic noise budgets, with no in-situ learning or additional optical MACs.
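As a rough illustration of the abstract's chance-constrained idea, the sketch below penalizes attention logits whose top-1 vs. top-2 margin is small relative to an activation-dependent noise-variance proxy, so that hardware noise is unlikely to flip attention rankings. This is a minimal sketch under stated assumptions: the function names (`sigma2_proxy`, `cct_margin_penalty`), the proxy form, and the coefficients `a`, `b`, `kappa` are illustrative, not the paper's actual implementation, which fits its proxies to measured microring-resonator noise.

```python
import numpy as np

def sigma2_proxy(x, a=1e-3, b=1e-4):
    """Activation-dependent variance proxy (assumed form): noise variance
    grows with the squared activation magnitude plus an additive floor.
    In the paper, such coefficients would come from measured MR-array noise."""
    return a * x**2 + b

def cct_margin_penalty(logits, kappa=2.0):
    """Hinge penalty on attention logits (last axis = keys): require the
    top-1/top-2 logit gap to exceed kappa times the proxy noise std,
    approximating a chance constraint against attention rank flips."""
    srt = np.sort(logits, axis=-1)
    margin = srt[..., -1] - srt[..., -2]              # top-1 minus top-2 logit
    sigma = np.sqrt(sigma2_proxy(logits).mean(axis=-1))  # proxy std per query
    return np.maximum(kappa * sigma - margin, 0.0).mean()

# A well-separated row incurs no penalty; tied logits are penalized.
print(cct_margin_penalty(np.array([[10.0, 0.0, 0.0]])))  # 0.0
print(cct_margin_penalty(np.array([[1.0, 1.0, 1.0]])) > 0)
```

During training, a penalty like this would be added to the task loss with a small weight, alongside the noise-aware LayerNorm the abstract describes.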
Problem

Research questions and friction points this paper is trying to address.

Vision Transformers
Silicon Photonics
Hardware Noise
Energy Constraints
Robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Silicon Photonics
Vision Transformers
Hardware-Aware Training
Noise Robustness
Chance-Constrained Training