Scaling Neural-Network-Based Molecular Dynamics with Long-Range Electrostatic Interactions to 51 Nanoseconds per Day

📅 2025-04-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Neural network molecular dynamics (NNMD) suffers from severe performance bottlenecks in systems with long-range electrostatic interactions, primarily due to the computational overhead of neural network inference and Ewald summation. To address this, we propose a co-optimization framework tailored for exascale supercomputing platforms. Our approach introduces: (i) hardware-accelerated FFT offloading for efficient long-range force computation; (ii) fine-grained intra-node overlap between neural network inference and long-range force evaluation; (iii) a ring-based atom-level load-balancing scheme to minimize inter-node communication; and (iv) deep integration with the DPLR framework and Fugaku's heterogeneous architecture. Evaluated on the Fugaku supercomputer, our method achieves up to a 37× speedup over baseline NNMD implementations, attaining a simulation throughput of 51 ns/day and surpassing the prior state of the art for NNMD with long-range electrostatics. This work establishes a scalable paradigm for high-fidelity, long-timescale simulations of large-scale ionic and polar systems.
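
To make the FFT-offloading point concrete, the sketch below is a toy illustration (not the paper's DPLR or Fugaku code) of the reciprocal-space step that such offloading targets: point charges are spread onto a regular mesh, Poisson's equation is solved in k-space with a Gaussian Ewald-style screening factor, and an inverse FFT returns the smooth long-range potential. The mesh size, box length, and screening width are illustrative values, not parameters from the paper.

```python
# Toy sketch of the reciprocal-space (FFT) step of smooth Ewald-style
# electrostatics; illustrative only, not the paper's implementation.
import numpy as np

def reciprocal_space_potential(positions, charges, box=10.0, mesh=32, sigma=0.5):
    """Return the smooth long-range potential on a 3D mesh via FFT."""
    h = box / mesh                                   # mesh spacing
    rho = np.zeros((mesh, mesh, mesh))
    # Nearest-grid-point charge assignment (real codes use B-spline spreading).
    idx = np.floor(positions / h).astype(int) % mesh
    for (i, j, k), q in zip(idx, charges):
        rho[i, j, k] += q / h**3

    # Solve Poisson's equation in k-space:
    # phi_k = 4*pi * exp(-sigma^2 k^2 / 2) * rho_k / k^2
    kx = 2.0 * np.pi * np.fft.fftfreq(mesh, d=h)
    KX, KY, KZ = np.meshgrid(kx, kx, kx, indexing="ij")
    k2 = KX**2 + KY**2 + KZ**2
    rho_k = np.fft.fftn(rho)
    green = np.zeros_like(k2)
    nonzero = k2 > 0
    green[nonzero] = 4.0 * np.pi * np.exp(-0.5 * sigma**2 * k2[nonzero]) / k2[nonzero]
    return np.real(np.fft.ifftn(green * rho_k))

# Usage: a handful of random unit charges in a cubic box.
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 10.0, size=(8, 3))
q = rng.choice([-1.0, 1.0], size=8)
phi = reciprocal_space_potential(pos, q)
print(phi.shape)  # (32, 32, 32)
```

In large periodic systems this 3D FFT dominates communication, which is why the paper targets it for hardware offloading and for overlap with inference.
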

📝 Abstract
Neural network-based molecular dynamics (NNMD) simulations incorporating long-range electrostatic interactions have significantly extended the method's applicability to heterogeneous and ionic systems, enabling effective modeling of critical physical phenomena such as protein folding and dipolar surfaces while maintaining ab initio accuracy. However, neural network inference and long-range force computation remain the major bottlenecks, severely limiting simulation speed. In this paper, we target DPLR, a state-of-the-art NNMD package that supports long-range electrostatics, and propose a set of comprehensive optimizations to enhance computational efficiency. We introduce (1) a hardware-offloaded FFT method that reduces communication overhead; (2) an overlapping strategy that hides long-range force computations using a single core per node; and (3) a ring-based load balancing method that enables even atom-level task redistribution with minimal communication overhead. Experimental results on the Fugaku supercomputer show that our work achieves a 37x performance improvement, reaching a maximum simulation speed of 51 ns/day.
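
The overlapping strategy in (2) can be pictured with the minimal sketch below, assuming a shared-memory node where one dedicated worker thread evaluates the FFT-heavy long-range contribution while the calling thread runs neural-network inference for the short-range part. The function names (nn_short_range_forces, fft_long_range_forces, md_step) are placeholders for illustration, not DPLR APIs.

```python
# Minimal sketch of overlapping long-range force evaluation with NN inference;
# placeholder physics, not the paper's implementation.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def nn_short_range_forces(positions):
    # Stand-in for per-atom neural-network inference (the short-range part).
    return -0.1 * positions

def fft_long_range_forces(positions):
    # Stand-in for the reciprocal-space electrostatics (the FFT-heavy part).
    return 0.01 * np.sin(positions)

def md_step(positions, velocities, pool, dt=1e-3):
    # Launch the long-range evaluation on the dedicated worker thread, then
    # compute the short-range part on the calling thread; the two overlap.
    long_range = pool.submit(fft_long_range_forces, positions)
    short_range = nn_short_range_forces(positions)
    forces = short_range + long_range.result()
    # Simple unit-mass update, just to close the loop.
    velocities = velocities + dt * forces
    positions = positions + dt * velocities
    return positions, velocities

pos, vel = np.zeros((4, 3)), np.zeros((4, 3))
with ThreadPoolExecutor(max_workers=1) as pool:   # one worker plays the "single core"
    for _ in range(10):
        pos, vel = md_step(pos, vel, pool)
```

The paper's version dedicates a single core per node to the long-range task so the remaining cores stay busy with inference; the thread pool here is only a stand-in for that arrangement.
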
Problem

Research questions and friction points this paper is trying to address.

Optimizing NNMD for faster long-range electrostatic simulations
Reducing bottlenecks in neural network inference and force computation
Enhancing DPLR package efficiency for large-scale molecular dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hardware-offloaded FFT reduces communication overhead
Overlapping strategy hides long-range force computations
Ring-based load balancing enables atom-level redistribution (see the sketch after this list)
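
A toy illustration of the ring-based idea, under the assumption that ranks are arranged in a ring and each rank exchanges atoms only with its immediate neighbor: surplus atoms above the average count are shifted one hop per round until the per-rank counts even out, so every message stays local. This sketches the concept only, not the paper's exact scheme.

```python
# Toy ring-based load balancing over per-rank atom counts; conceptual sketch only.
def ring_balance(counts, max_rounds=100):
    n = len(counts)
    target = sum(counts) // n
    for _ in range(max_rounds):
        if max(counts) - min(counts) <= 1:
            break
        # Each rank forwards its surplus over the target to the next rank in the ring.
        sends = [max(c - target, 0) for c in counts]
        counts = [counts[r] - sends[r] + sends[(r - 1) % n] for r in range(n)]
    return counts

print(ring_balance([120, 40, 90, 70]))  # -> [80, 80, 80, 80]
```
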
Jianxiong Li
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China
Beining Zhang
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China
Mingzhen Li
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China
Siyu Hu
Institute of Computing Technology, Chinese Academy of Sciences
Jinzhe Zeng
University of Science and Technology of China
Lijun Liu
Department of Mechanical Engineering, Graduate School of Engineering, The University of Osaka, Suita, Japan
Guojun Yuan
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China
Zhan Wang
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China
Guangming Tan
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China
Weile Jia
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China