What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips

📅 2025-05-09
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
The escalating computational and energy demands of large language models (LLMs) are exacerbating the memory and data-movement bottlenecks of conventional von Neumann architectures; training GPT-3 alone is estimated to have consumed about 1,300 MWh. Method: This review develops a photonic AI hardware paradigm tailored to long-context Transformer inference, integrating silicon-based Mach–Zehnder interferometer (MZI) meshes, wavelength-division-multiplexed microring resonators, high-speed graphene/TMDC electro-optic modulators, spiking neural network circuits, and hybrid spintronic-photonic synapses to map the dynamic matrix operations of Transformers onto chip-scale photonics. Contribution/Results: The surveyed architecture is projected to surpass state-of-the-art electronic processors by 2–3 orders of magnitude in both throughput and energy efficiency. Crucially, the review identifies two key open challenges: on-chip storage of long-context states and retention of ultra-large-scale data within the photonic chip.
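To make the MZI-mesh primitive concrete, below is a minimal NumPy sketch, not taken from the paper, of how a rectangular (Clements-style) mesh of programmable 2×2 interferometers composes into an N×N unitary, so that one matrix-vector product costs a single optical pass through the chip. The transfer-matrix convention and the helper names (mzi, embed, mesh_unitary) are illustrative assumptions.

```python
import numpy as np

def mzi(theta, phi):
    """2x2 unitary of one Mach-Zehnder interferometer: an input phase
    shifter (phi) followed by two 50:50 couplers enclosing an internal
    phase shifter (theta). The convention is illustrative."""
    coupler = np.array([[1, 1j], [1j, 1]], dtype=complex) / np.sqrt(2)
    internal = np.diag([np.exp(1j * theta), 1.0])
    external = np.diag([np.exp(1j * phi), 1.0])
    return coupler @ internal @ coupler @ external

def embed(t2, n, k):
    """Place a 2x2 block unitary on adjacent waveguides k, k+1 of n modes."""
    u = np.eye(n, dtype=complex)
    u[k:k + 2, k:k + 2] = t2
    return u

def mesh_unitary(thetas, phis, n):
    """Rectangular mesh: alternating layers of MZIs on even and odd
    waveguide pairs. With a final column of output phase shifters
    (omitted here), n such layers realize any n x n unitary."""
    u = np.eye(n, dtype=complex)
    i = 0
    for layer in range(n):
        for k in range(layer % 2, n - 1, 2):
            u = embed(mzi(thetas[i], phis[i]), n, k) @ u
            i += 1
    return u

rng = np.random.default_rng(0)
n = 8
n_mzi = sum(len(range(layer % 2, n - 1, 2)) for layer in range(n))  # n(n-1)/2
U = mesh_unitary(rng.uniform(0, 2 * np.pi, n_mzi),
                 rng.uniform(0, 2 * np.pi, n_mzi), n)
assert np.allclose(U.conj().T @ U, np.eye(n))   # unitary by construction
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # input optical field
y = U @ x   # one matrix-vector product per optical pass through the chip
```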

📝 Abstract
Large language models (LLMs) are rapidly pushing the limits of contemporary computing hardware. For example, training GPT-3 has been estimated to consume around 1,300 MWh of electricity, and projections suggest future models may require city-scale (gigawatt) power budgets. These demands motivate exploration of computing paradigms beyond conventional von Neumann architectures. This review surveys emerging photonic hardware optimized for next-generation generative AI computing. We discuss integrated photonic neural network architectures (e.g., Mach–Zehnder interferometer meshes, lasers, wavelength-multiplexed microring resonators) that perform ultrafast matrix operations. We also examine promising alternative neuromorphic devices, including spiking neural network circuits and hybrid spintronic-photonic synapses, which combine memory and processing. The integration of two-dimensional materials (graphene, TMDCs) into silicon photonic platforms is reviewed for tunable modulators and on-chip synaptic elements. Transformer-based LLM architectures (self-attention and feed-forward layers) are analyzed in this context, identifying strategies and challenges for mapping dynamic matrix multiplications onto these novel hardware substrates. We then dissect the mechanisms of mainstream LLMs, such as ChatGPT, DeepSeek, and LLaMA, highlighting their architectural similarities and differences. We synthesize state-of-the-art components, algorithms, and integration methods, highlighting key advances and open issues in scaling such systems to mega-scale LLMs. We find that photonic computing systems could surpass electronic processors by orders of magnitude in throughput and energy efficiency, but require breakthroughs in memory, especially for long-context windows and long token sequences, and in storage of ultra-large datasets.
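The abstract's distinction between fixed weights and dynamic matrix multiplications is the crux of mapping Transformers onto photonics, and a short sketch makes it explicit. In the toy single-head attention below (plain NumPy, illustrative shapes; not the paper's code), Wq, Wk, and Wv are fixed after training and could be programmed once into a photonic mesh, whereas QKᵀ and the attention-times-V product multiply two activation matrices that change with every token.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # numerically stable
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention(X, Wq, Wk, Wv):
    # Static weights: fixed after training, so they can be programmed
    # once into a photonic mesh or microring weight bank.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Dynamic matmuls: both operands are activations that change with
    # every token, so a photonic "weight" matrix would need
    # reprogramming at the symbol rate, which is the hard part.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(S) @ V

rng = np.random.default_rng(0)
T, d = 16, 32                       # toy sequence length and model width
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
Y = attention(X, Wq, Wk, Wv)        # (T, d) context-mixed activations
```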
Problem

Research questions and friction points this paper is trying to address.

Exploring photonic chips for energy-efficient AI computing.
Developing integrated photonic neural networks for ultrafast matrix operations.
Addressing memory challenges in scaling photonic systems to large LLMs (a back-of-envelope sizing sketch for long-context state follows this list).
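A quick calculation shows why long-context state storage dominates the memory problem. The sketch below uses roughly LLaMA-2-7B-like shapes (32 layers, 32 KV heads, head dimension 128, fp16); these numbers are illustrative assumptions, not figures from the paper.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
    # Per token, each layer stores a K and a V vector (factor 2) of
    # n_kv_heads * head_dim elements, at bytes_per_elem (2 for fp16).
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens

# Roughly LLaMA-2-7B-like shapes at a 128k-token context window:
gib = kv_cache_bytes(32, 32, 128, n_tokens=128 * 1024) / 2**30
print(f"{gib:.0f} GiB per 128k-token sequence")   # -> 64 GiB
```

Tens of gigabytes of rapidly mutating state per sequence is far beyond any demonstrated on-chip photonic memory, which is exactly the long-context challenge the summary highlights.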
Innovation

Methods, ideas, or system contributions that make the work stand out.

Photonic chips enable ultrafast matrix operations (see the microring weight-bank sketch after this list)
Hybrid spintronic-photonic synapses combine memory and processing
2D materials enhance tunable modulators and synaptic elements
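As a sketch of the wavelength-multiplexed idea, the toy model below assigns one wavelength per input element, sets each microring's through-port transmission to the target weight via its detuning (ideal Lorentzian notch model), and lets a photodetector sum the channel powers into a dot product. The function names and the weight-to-detuning mapping are illustrative assumptions; optical power is non-negative, so signed weights in practice need balanced photodetection.

```python
import numpy as np

def ring_detuning(w, fwhm=1.0):
    """Detuning that makes an ideal all-pass ring's through-port
    transmission equal the target weight w (Lorentzian notch model)."""
    w = np.clip(w, 0.0, 0.99)              # w = 1 needs infinite detuning
    return 0.5 * fwhm * np.sqrt(w / (1.0 - w))

def mrr_weight_bank_mvm(W, x, fwhm=1.0):
    """y = W @ x with one wavelength per input element: each row of rings
    weights the channels, and a photodetector sums the optical powers."""
    delta = ring_detuning(W, fwhm)                      # thermal/EO tuning
    T = 1.0 - 1.0 / (1.0 + (2.0 * delta / fwhm) ** 2)   # realized weights
    return T @ x                                        # WDM power summation

rng = np.random.default_rng(0)
W = rng.uniform(0.0, 0.9, size=(4, 8))   # optical power is non-negative,
x = rng.uniform(0.0, 1.0, size=8)        # so weights/inputs live in [0, 1]
y = mrr_weight_bank_mvm(W, x)
assert np.allclose(y, W @ x)             # the bank realizes W @ x exactly
```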
👥 Authors
Renjie Li
School of Science and Engineering, Guangdong Key Laboratory of Optoelectronic Materials and Chips, Shenzhen Key Lab of Semiconductor Lasers, The Chinese University of Hong Kong, Shenzhen
Wenjie Wei
University of Electronic Science and Technology of China
Research interests: Spiking Neural Networks, Neuromorphic Computing, Model Compression, Event-based Vision
Qi Xin
School of Science and Engineering, Guangdong Key Laboratory of Optoelectronic Materials and Chips, Shenzhen Key Lab of Semiconductor Lasers, The Chinese University of Hong Kong, Shenzhen
Xiaoli Liu
University of Electronic Science and Technology of China
Sixuan Mao
School of Science and Engineering, Guangdong Key Laboratory of Optoelectronic Materials and Chips, Shenzhen Key Lab of Semiconductor Lasers, The Chinese University of Hong Kong, Shenzhen
Erik Ma
University of California, Berkeley
Zijian Chen
Shanghai Jiao Tong University | Shanghai AI Laboratory
Research interests: Image/Video Quality Assessment, Large Multi-modal Models
Malu Zhang
University of Electronic Science and Technology of China
Haizhou Li
The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), China; NUS, Singapore
Research interests: Automatic Speech Recognition, Speaker Recognition, Language Recognition, Voice Conversion, Machine Translation
Zhaoyu Zhang
Associate Professor, The Chinese University of Hong Kong, Shenzhen
Research interests: Optoelectronics, Semiconductor Lasers, Organic Light-Emitting Devices, Perovskite Light-Emitting Devices