SW-TNC : Reaching the Most Complex Random Quantum Circuit via Tensor Network Contraction

📅 2025-04-12
🤖 AI Summary
Addressing the classical simulation challenge of very large random quantum circuits (e.g., Zuchongzhi-60-24), this work presents a high-performance tensor network contraction simulator tailored for the Sunway many-core architecture, scaling to more than a thousand nodes. Methodologically, it introduces three techniques: multi-core cooperative step fusion, in-kernel vectorized permutation, and a split-K tensor contraction operator. Together with data reuse and memory organization schemes, these target the bottlenecks of slicing overhead, poor data locality, and low computational intensity. The design co-optimizes contraction algorithms, memory layout, and in-kernel vectorization for the hardware architecture. Evaluated on 1,024 Sunway nodes (399,360 cores), the simulator accelerates the Zuchongzhi-60-24 simulation by more than 10×, supporting classical simulation of the most complex random quantum circuits to date.
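Tensor contractions are commonly lowered to an index permutation followed by a matrix multiply, which is one reason permutations can become a hotspot. The sketch below illustrates that general pattern in plain Python; it is not the SW-TNC kernel, and the names `permute3`, `contract_b`, and `contract_direct` are hypothetical helpers introduced only for this example.

```python
def permute3(T, shape, perm):
    """Permute a 3-index tensor stored as a flat row-major list.

    New axis n corresponds to old axis perm[n]. Returns (data, new_shape).
    """
    new_shape = tuple(shape[p] for p in perm)
    out = []
    for x in range(new_shape[0]):
        for y in range(new_shape[1]):
            for z in range(new_shape[2]):
                idx = [0, 0, 0]
                idx[perm[0]], idx[perm[1]], idx[perm[2]] = x, y, z
                out.append(T[(idx[0] * shape[1] + idx[1]) * shape[2] + idx[2]])
    return out, new_shape


def contract_b(T, shape, U, d):
    """R[a,c,d] = sum_b T[a,b,c] * U[b,d], done as permute + matrix multiply."""
    a, b, c = shape
    # Move the contracted index b to the last position: layout [a, c, b].
    Tp, _ = permute3(T, shape, (0, 2, 1))
    rows = a * c  # fuse (a, c) into a single matrix row index
    R = []
    for r in range(rows):
        for col in range(d):
            R.append(sum(Tp[r * b + k] * U[k * d + col] for k in range(b)))
    return R  # flat row-major tensor of shape (a, c, d)


def contract_direct(T, shape, U, d):
    """Reference result computed directly from the index expression."""
    a, b, c = shape
    return [sum(T[(ai * b + bi) * c + ci] * U[bi * d + di] for bi in range(b))
            for ai in range(a) for ci in range(c) for di in range(d)]
```

In this toy version the permutation is a separate pass over memory. Performing it inside the compute kernel, as the summary describes, would avoid that extra pass.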

📝 Abstract
Classical simulation is essential in quantum algorithm development and quantum device verification. With the increasing complexity and diversity of quantum circuit structures, existing classical simulation algorithms need to be improved and extended. In this work, we propose novel strategies for a tensor-network-contraction-based simulator on the Sunway architecture. Our approach addresses three main aspects: complexity, computational paradigms, and fine-grained optimization. Data reuse schemes are designed to reduce floating-point operations, and memory organization techniques are employed to eliminate slicing overhead while maintaining parallelism. The step fusion strategy is extended with multi-core cooperation to improve data locality and computational intensity. Fine-grained optimizations, such as in-kernel vectorized permutations and split-K operators, are also developed to address the challenges posed by the new hotspot distribution and topological structure. These innovations accelerate the simulation of Zuchongzhi-60-24 by more than 10 times, using more than 1024 Sunway nodes (399,360 cores). Our work demonstrates the potential for enabling efficient classical simulation of increasingly complex quantum circuits.
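To make "slicing overhead" and "data reuse" concrete, here is a minimal sketch on a toy ring network A[i,j] B[j,k] C[k,l] D[l,i]. Slicing fixes the index j so that each slice is an independent task, but a naive implementation recomputes the product C·D in every slice. The network, the function names, and the `reuse` flag are assumptions made for illustration, not the paper's scheme.

```python
def matmul(X, Y):
    """Plain dense matrix product on nested lists."""
    return [[sum(X[r][t] * Y[t][c] for t in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]


def ring_full(A, B, C, D):
    """Reference: contract the whole ring without slicing (trace of A*B*C*D)."""
    M = matmul(matmul(matmul(A, B), C), D)
    return sum(M[i][i] for i in range(len(M)))


def ring_slice(A, B, CD, j):
    """One slice: with j fixed, A and B are read as vectors A[:, j] and B[j, :]."""
    n_i, n_k = len(A), len(B[0])
    return sum(A[i][j] * B[j][k] * CD[k][i]
               for i in range(n_i) for k in range(n_k))


def ring_sliced(A, B, C, D, reuse=True):
    """Sum independent slices over j.

    Returns (value, number of C*D products computed). With reuse=False the
    slice-independent product C*D is recomputed per slice, which is the
    redundant work that slicing introduces.
    """
    shared = None
    products = 0
    total = 0
    for j in range(len(B)):
        if shared is None or not reuse:
            shared = matmul(C, D)
            products += 1
        total += ring_slice(A, B, shared, j)
    return total, products
```

Both settings give the same value. The difference is only in how often the slice-independent part is recomputed, which grows with the number of slices when nothing is reused.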
Problem

Research questions and friction points this paper is trying to address.

Enhancing classical simulation for complex quantum circuits
Optimizing tensor network contraction on Sunway architecture
Reducing computational overhead in quantum device verification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data reuse schemes reduce floating-point operations
Memory organization eliminates slicing overhead
Fine-grained optimizations (in-kernel vectorized permutations, split-K operators) address the new hotspot distribution and topological structure
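Split-K is generally understood as partitioning the contracted (K) dimension of a matrix product into chunks, computing a partial product per chunk, and then reducing the partial results. The sketch below shows that general idea sequentially in plain Python; how the paper's operator maps chunks to Sunway cores is not described on this page, and `matmul_block` and `split_k_matmul` are hypothetical names.

```python
def matmul_block(A, B, k_lo, k_hi):
    """Partial product that sums only over k in [k_lo, k_hi)."""
    m, n = len(A), len(B[0])
    return [[sum(A[r][k] * B[k][c] for k in range(k_lo, k_hi))
             for c in range(n)] for r in range(m)]


def split_k_matmul(A, B, parts):
    """Compute A*B by splitting the contracted dimension K into chunks."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    K = len(B)
    chunk = max(1, -(-K // parts))  # ceiling division, at least 1
    # Each partial product is independent and could run on a separate worker.
    partials = [matmul_block(A, B, lo, min(lo + chunk, K))
                for lo in range(0, K, chunk)]
    # Reduction step: element-wise sum of the partial results.
    m, n = len(A), len(B[0])
    return [[sum(P[r][c] for P in partials) for c in range(n)]
            for r in range(m)]
```

This decomposition is typically chosen when the output matrix is small and K is large, so that splitting only the output would leave most workers idle. The cost is the extra reduction over partial results.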
Authors

Yaojian Chen, Tsinghua University, China
Zhaoqi Sun, Zhengzhou University, China
Chengyu Qiu, Tsinghua University, China
Zegang Li, Tsinghua University, China
Yanfei Liu, National Supercomputing Center in Wuxi, China
Lin Gan, Tsinghua University
Xiaohui Duan, Shandong University, China
Guangwen Yang, Professor of Computer Science and Technology, Tsinghua University