A Survey of Neural Network Variational Monte Carlo from a Computing Workload Characterization Perspective

📅 2026-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Neural-network-based variational Monte Carlo (NNVMC) methods face significant challenges in efficient GPU deployment due to high computational and memory overheads. This work presents the first systematic workload-level analysis of NNVMC’s heterogeneous computing characteristics on GPUs. Employing a unified performance evaluation protocol that integrates GPU hardware performance counters, Roofline modeling, and arithmetic intensity analysis, we conduct an end-to-end empirical assessment of four representative models: PauliNet, FermiNet, Psiformer, and Orbformer. Our study reveals substantial compute-memory imbalance across different execution phases, with performance bottlenecks primarily stemming from low-arithmetic-intensity element-wise operations and frequent data movement. These findings provide critical insights for phase-aware scheduling, memory-centric optimizations, and hardware-software co-design strategies tailored to NNVMC workloads.
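The Roofline modeling and arithmetic-intensity analysis mentioned in the summary can be illustrated with a minimal sketch. The peak-compute and bandwidth numbers below are illustrative assumptions (roughly an A100-class GPU), not figures taken from the paper:

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of DRAM traffic for a kernel."""
    return flops / bytes_moved

def roofline_bound(intensity, peak_flops, peak_bandwidth):
    """Attainable FLOP/s under the roofline model:
    the minimum of the compute roof and the memory roof."""
    return min(peak_flops, intensity * peak_bandwidth)

# Illustrative device limits (assumed, A100-like):
PEAK_FLOPS = 19.5e12   # 19.5 TFLOP/s FP32
PEAK_BW = 1.55e12      # 1.55 TB/s HBM bandwidth

# A low-intensity elementwise kernel: ~1 FLOP per 8 bytes read + written,
# the kind of kernel the study identifies as a bottleneck.
ai = arithmetic_intensity(1, 8)                      # 0.125 FLOP/byte
print(roofline_bound(ai, PEAK_FLOPS, PEAK_BW))       # memory-bound, far below peak
```

A kernel whose intensity falls left of the "ridge point" (here `PEAK_FLOPS / PEAK_BW` ≈ 12.6 FLOP/byte) is memory-bound, which is the regime the elementwise and data-movement kernels discussed above occupy.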

📝 Abstract
Neural Network Variational Monte Carlo (NNVMC) has emerged as a promising paradigm for solving quantum many-body problems by combining variational Monte Carlo with expressive neural-network wave-function ansätze. Although NNVMC can achieve competitive accuracy with favorable asymptotic scaling, practical deployment remains limited by high runtime and memory cost on modern graphics processing units (GPUs). Compared with language and vision workloads, NNVMC execution is shaped by physics-specific stages, including Markov-Chain Monte Carlo sampling, wave-function construction, and derivative/Laplacian evaluation, which produce heterogeneous kernel behavior and nontrivial bottlenecks. This paper provides a workload-oriented survey and empirical GPU characterization of four representative ansätze: PauliNet, FermiNet, Psiformer, and Orbformer. Using a unified profiling protocol, we analyze model-level runtime and memory trends and kernel-level behavior through family breakdown, arithmetic intensity, roofline positioning, and hardware utilization counters. The results show that end-to-end performance is often constrained by low-intensity elementwise and data-movement kernels, while the compute/memory balance varies substantially across ansätze and stages. Based on these findings, we discuss algorithm-hardware co-design implications for scalable NNVMC systems, including phase-aware scheduling, memory-centric optimization, and heterogeneous acceleration.
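The derivative/Laplacian stage the abstract describes — evaluating ∇²log|ψ| as part of the local energy — can be sketched with a toy log-wavefunction and finite differences. The isotropic Gaussian ansatz below is a hypothetical stand-in for illustration, not any of the surveyed models, which compute this term via automatic differentiation:

```python
def log_psi(x, alpha=0.5):
    """Toy log-wavefunction: isotropic Gaussian, log|psi(x)| = -alpha * |x|^2."""
    return -alpha * sum(xi * xi for xi in x)

def laplacian_log_psi(x, h=1e-4):
    """Central finite-difference Laplacian of log|psi| at configuration x.
    Each coordinate contributes one second-derivative stencil, so the cost
    grows with system size -- the scaling pressure the paper attributes
    to the derivative/Laplacian stage."""
    f0 = log_psi(x)
    lap = 0.0
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        lap += (log_psi(xp) - 2.0 * f0 + log_psi(xm)) / (h * h)
    return lap

x = [0.3, -1.2, 0.7]
# Analytic value for this Gaussian: -2 * alpha * dim = -3.0
print(laplacian_log_psi(x))
```

Because the ansatz is quadratic, the stencil recovers the analytic Laplacian up to rounding; for neural ansätze the same quantity is obtained with nested automatic differentiation, which is what makes this stage kernel-heavy on GPUs.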
Problem

Research questions and friction points this paper is trying to address.

Neural Network Variational Monte Carlo, quantum many-body problems, GPU performance, workload characterization, computational bottlenecks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural Network Variational Monte Carlo, GPU workload characterization, algorithm-hardware co-design, quantum many-body problems, performance bottlenecks
Zhengze Xiao
Department of Computer Science and Engineering, Hong Kong University of Science and Technology
Xuanzhe Ding
Department of Computer Science and Engineering, Hong Kong University of Science and Technology
Yuyang Lou
Department of Chemistry, Hong Kong University of Science and Technology
Lixue Cheng
Assistant Professor, Hong Kong University of Science and Technology
AI4Science, Electronic Structure, Machine Learning, Theoretical Chemistry, Quantum Chemistry
Chaojian Li
Hong Kong University of Science and Technology
Efficient AI, Hardware / software codesign