A Portable Multi-GPU Solver for Collisional Plasmas with Coulombic Interactions

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the computational cost of particle-in-cell (PIC) simulations of low-temperature plasmas (LTPs). The method is a kinetic-fluid hybrid: it uses a kinetic description for electrons and a fluid approximation for heavy species. The paper focuses on GPU acceleration of velocity-space interactions, in particular collisions of electrons with neutrals, ions, and other electrons. The kernels are implemented in PyKokkos, a Python-based HPC framework with CUDA and HIP backends, and combined with MPI for distributed-memory parallelism. Performance is reported on NVIDIA Volta V100 and AMD MI250X GPUs: the MI250X is slightly faster for most kernels but is more sensitive to register pressure. Scaling results are reported for up to 16 MPI ranks.

📝 Abstract
We study parallel particle-in-cell (PIC) methods for low-temperature plasmas (LTPs), which discretize kinetic formulations that capture the time evolution of the probability density function of particles as a function of position and velocity. We use a kinetic description for electrons and a fluid approximation for heavy species. In this paper, we focus on GPU acceleration of algorithms for velocity-space interactions and in particular, collisions of electrons with neutrals, ions, and electrons. Our work has two thrusts. The first is algorithmic exploration and analysis. The second is examining the viability of rapid-prototyping implementations using Python-based HPC tools, in particular PyKokkos. We discuss several common PIC kernels and present performance results on NVIDIA Volta V100 and AMD MI250X GPUs. Overall, the MI250X is slightly faster for most kernels but shows more sensitivity to register pressure. We also report scaling results for a distributed memory implementation on up to 16 MPI ranks.
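To make the idea of a velocity-space collision kernel concrete, here is a minimal sketch of one Monte Carlo step for elastic electron-neutral collisions. It is not the paper's algorithm. It assumes a constant collision frequency `nu`, isotropic scattering, and an infinitely heavy neutral (so the electron speed is unchanged). It is written in plain Python rather than PyKokkos, and the function name `collide_isotropic` and its parameters are hypothetical.

```python
import math
import random


def collide_isotropic(velocities, nu, dt, rng):
    """Apply one Monte Carlo elastic-collision step in place.

    velocities: list of [vx, vy, vz] electron velocities (m/s)
    nu: assumed constant collision frequency (1/s)
    dt: time step (s)
    rng: a random.Random instance
    Returns the number of electrons that collided.
    """
    # Probability that an electron collides at least once during dt.
    p_collide = 1.0 - math.exp(-nu * dt)
    n_hits = 0
    for v in velocities:
        if rng.random() >= p_collide:
            continue  # no collision for this electron
        speed = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
        # Sample a direction uniformly on the unit sphere.
        cos_t = 2.0 * rng.random() - 1.0
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        phi = 2.0 * math.pi * rng.random()
        # Assumption: infinitely heavy neutral, so the speed is preserved.
        v[0] = speed * sin_t * math.cos(phi)
        v[1] = speed * sin_t * math.sin(phi)
        v[2] = speed * cos_t
        n_hits += 1
    return n_hits
```

Each electron is processed independently of the others, which is why a loop of this kind maps naturally onto one GPU thread per particle. A production kernel would typically use energy-dependent cross sections rather than a constant `nu`.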

Problem

Research questions and friction points this paper is trying to address.

Develop a portable multi-GPU solver for collisional plasmas
Accelerate algorithms for electron collisions with neutrals, ions, and other electrons
Evaluate Python-based HPC tools for rapid prototyping

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-GPU parallel particle-in-cell methods
Python-based HPC tools (PyKokkos) for rapid prototyping
Distributed-memory implementation with scaling results on up to 16 MPI ranks
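The summary does not say how work is divided among MPI ranks. As a generic illustration only, one common approach in distributed-memory codes is a contiguous block partition, in which each rank owns a nearly equal slice of the items (particles or grid cells). The helper `block_range` below is hypothetical and uses no MPI calls; in a real code, `rank` and `n_ranks` would come from the MPI communicator.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open index range [start, stop) owned by `rank`.

    Items are split into contiguous blocks whose sizes differ by at most
    one; the first (n_items % n_ranks) ranks receive the extra item.
    """
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Example: split 10 items across 4 ranks.
ranges = [block_range(10, 4, r) for r in range(4)]
print(ranges)  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Balanced blocks keep the per-rank workload roughly equal, which is a precondition for meaningful scaling measurements.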