Tensor Processing with Homodyne Photonic Integrated Circuits exceeds 1,000 TOPS

2026-04-20

AI Summary
This work addresses the energy-efficiency and speed bottlenecks of conventional electronic computing in high-throughput AI tasks by proposing a general matrix multiplication architecture based on homodyne photonic integrated circuits. The system integrates, for the first time, a 256×256 computational array on a single chip, using time-division multiplexing and on-chip optical fan-out to reduce the number of modulators from O(N²) to O(N). Combining high-speed thin-film lithium niobate modulators, silicon/silicon-nitride photonic circuits, wafer-scale fabrication, and foundry-available packaging, the platform achieves 6–7 bits of computational precision at 120 Gbaud, supports channel configurations ranging from 8×8 to 256×100, and delivers 1,000–6,000 TOPS of throughput with an energy efficiency of 330 TOPS/W. The system runs the Qwen2.5-0.5B model and generates accurate tokens.
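The summary does not spell out how homodyne detection and time multiplexing combine into a matrix product, so the following is only a minimal numerical sketch under stated assumptions. It assumes real-valued optical amplitudes, an ideal 50:50 beamsplitter, and a noiseless balanced detector that integrates over the time steps. The function names and the 2N modulator count (one modulator per streamed row plus one per streamed column) are illustrative assumptions, not details taken from the paper.

```python
def homodyne_mac(x_seq, w_seq):
    """Dot product of two time sequences via idealized balanced homodyne detection.

    Assumption: at each time step the two fields interfere on a 50:50
    beamsplitter, giving output intensities (x + w)^2 / 2 and (x - w)^2 / 2.
    Their difference is 2 * x * w, which the detector accumulates over time.
    """
    charge = 0.0
    for x, w in zip(x_seq, w_seq):
        plus = (x + w) ** 2 / 2
        minus = (x - w) ** 2 / 2
        charge += plus - minus
    return charge / 2  # undo the factor of 2 from the interference term


def homodyne_matmul(X, W):
    """Matrix product where every output element is one homodyne unit.

    Row i of X and column j of W are each streamed in time and fanned out,
    so unit (i, j) sees both sequences and integrates their product.
    """
    k, m = len(W), len(W[0])
    columns = [[W[t][j] for t in range(k)] for j in range(m)]
    return [[homodyne_mac(row, col) for col in columns] for row in X]


def modulator_counts(n):
    """Hypothetical modulator budget for an n x n array of units."""
    return {"one_per_unit": n * n, "time_multiplexed": 2 * n}


result = homodyne_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)
print(modulator_counts(256))
```

In this toy model, the 256×256 array would need 65,536 modulators with one modulator per unit but only 512 when inputs are streamed in time and fanned out, which illustrates the O(N²) to O(N) scaling described above.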


Abstract
High-performance computing underpins modern artificial intelligence (AI), enabling foundation models, real-time inference and perception in autonomous systems, and data-intensive scientific simulations. Recent advances in quantization techniques, which use low-precision computation without degrading model accuracy, create new opportunities for analog photonic computing characterized by ultra-high clock rates and low energy consumption. Here we propose and demonstrate a coherent homodyne integrated circuit capable of general matrix multiplication (GEMM) with aggregate throughput that exceeds 1,000 TOPS (tera-operations per second), enabled by massive on-chip optical fan-out and parallelism. By leveraging time multiplexing, the required modulator count is reduced from $O(N^2)$ to $O(N)$, allowing dense integration of record-scale $256 \times 256$ homodyne units (each <0.0064 mm$^2$) within a single reticle. We employ 64 wafer-scale-fabricated thin-film lithium niobate (TFLN) transmitters (each with over 40 GHz of bandwidth and a propagation loss of 0.2 dB/cm) to encode data; these are chip-to-chip coupled to Si/SiN computing circuits (64 channels). Our system achieves up to 7-bit computational accuracy across $8 \times 8$ parallel channels at a record computing clock rate of 120 Gbaud, and 6-bit statistical accuracy across $256 \times 100$ channels at 20-128 Gbaud, representing a total throughput of 1,000-6,000 TOPS. Massive parallelism amortizes the cost of optoelectronic (OE) conversion, allowing 330 TOPS/W efficiency using foundry-available packaging technology. The system throughput is benchmarked with the Qwen2.5 0.5-billion-parameter model, which generates accurate tokens. High throughput and energy efficiency establish a near-term pathway toward light-based accelerators for large-scale training and low-latency inference from datacenters to edges, accelerating new models toward artificial general intelligence.
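The throughput figures can be sanity-checked with simple arithmetic. The sketch below assumes that each homodyne unit performs one multiply-accumulate (MAC) per symbol and that a MAC counts as two operations. This is a common accounting convention but is not stated in the abstract, and the function names are illustrative.

```python
def throughput_tops(rows, cols, gbaud, ops_per_mac=2):
    """Aggregate throughput in TOPS for a rows x cols array of MAC units.

    Assumption: one MAC per unit per symbol, ops_per_mac operations per MAC.
    """
    macs_per_second = rows * cols * gbaud * 1e9
    return macs_per_second * ops_per_mac / 1e12


def energy_per_op_fj(tops_per_watt):
    """Energy per operation in femtojoules implied by a TOPS/W figure."""
    return 1e15 / (tops_per_watt * 1e12)


for gbaud in (20, 120):
    print(f"{gbaud} Gbaud: {throughput_tops(256, 100, gbaud):.0f} TOPS")
# 20 Gbaud: 1024 TOPS
# 120 Gbaud: 6144 TOPS

print(f"{energy_per_op_fj(330):.2f} fJ per operation")
# 3.03 fJ per operation
```

Under these assumptions, 256 × 100 channels give roughly 1,000 TOPS at 20 Gbaud and roughly 6,000 TOPS at 120 Gbaud, consistent with the range quoted in the abstract, and 330 TOPS/W corresponds to about 3 fJ per operation.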
Problem

Research questions and friction points this paper is trying to address.

photonic computing
high-throughput
energy efficiency
matrix multiplication
AI accelerators
Innovation

Methods, ideas, or system contributions that make the work stand out.

homodyne photonic integrated circuits
tensor processing
time multiplexing
thin-film lithium niobate (TFLN)
optical matrix multiplication
Authors
Lian Zhou
Opticore Inc., Berkeley, CA 94704
Kaiwen Xue
Huawei Cloud Computing Technologies Co., Ltd | The Chinese University of Hong Kong, Shenzhen (CUHKSZ)
Robotics, 3D Vision, Embodied AI, Autonomous Driving
Yun-Jhu Lee
Opticore Inc., Berkeley, CA 94704
Chun-Ho Lee
Opticore Inc., Berkeley, CA 94704
Yuan Li
Opticore Inc., Berkeley, CA 94704
Kiwon Kwon
Electrical Engineering and Computing Sciences, University of California, Berkeley, CA 94720
Weipeng Zhang
Opticore Inc., Berkeley, CA 94704
Songlin Zhao
Opticore Inc., Berkeley, CA 94704
Jason Moraes
Opticore Inc., Berkeley, CA 94704
Niranjan Bhatia
Electrical Engineering and Computing Sciences, University of California, Berkeley, CA 94720
Ryan Hamerly
MIT
Photonics, Quantum Optics, Nonlinear Optics
Mengjie Yu
Assistant Professor at UC Berkeley, EECS
Nonlinear photonics, nanophotonics, mid-infrared optics, quantum device
Zaijun Chen
Opticore Inc., Berkeley, CA 94704; Electrical Engineering and Computing Sciences, University of California, Berkeley, CA 94720