A Portable Multi-GPU Solver for Collisional Plasmas with Coulombic Interactions

📅 2025-08-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets the computational cost of particle-in-cell (PIC) simulations of low-temperature plasmas (LTPs). The method is a kinetic-fluid hybrid: electrons are treated kinetically, and heavy species are treated with a fluid approximation. The implementation uses PyKokkos, a Python-based performance-portability framework with CUDA and HIP backends, and MPI for distributed-memory parallelism. The focus is on GPU kernels for velocity-space interactions, in particular collisions of electrons with neutrals, ions, and other electrons. Performance is reported on NVIDIA V100 and AMD MI250X GPUs: the MI250X is slightly faster for most kernels but more sensitive to register pressure. Scaling results are reported on up to 16 MPI ranks.

📝 Abstract

We study parallel particle-in-cell (PIC) methods for low-temperature plasmas (LTPs), which discretize kinetic formulations that capture the time evolution of the probability density function of particles as a function of position and velocity. We use a kinetic description for electrons and a fluid approximation for heavy species. In this paper, we focus on GPU acceleration of algorithms for velocity-space interactions and in particular, collisions of electrons with neutrals, ions, and electrons. Our work has two thrusts. The first is algorithmic exploration and analysis. The second is examining the viability of rapid-prototyping implementations using Python-based HPC tools, in particular PyKokkos. We discuss several common PIC kernels and present performance results on NVIDIA Volta V100 and AMD MI250X GPUs. Overall, the MI250X is slightly faster for most kernels but shows more sensitivity to register pressure. We also report scaling results for a distributed memory implementation on up to 16 MPI ranks.
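
To make the idea of a velocity-space collision kernel concrete, the following is a minimal sketch of one Monte Carlo collision step, not the paper's algorithm. It assumes a constant collision frequency `nu`, infinitely heavy neutrals (so electron speed is conserved), and isotropic scattering; the function name `collide_isotropic` and all parameters are hypothetical and chosen only for illustration. It uses plain Python rather than PyKokkos so that it runs without a GPU.

```python
import math
import random


def collide_isotropic(velocities, nu, dt, rng):
    """One Monte Carlo collision step on a list of (vx, vy, vz) tuples.

    Assumptions (illustrative only): constant collision frequency nu,
    elastic scattering off infinitely heavy neutrals, isotropic angles.
    Returns the updated velocities and the number of collisions applied.
    """
    # Probability that a given electron collides during the step dt.
    p_collide = 1.0 - math.exp(-nu * dt)
    updated = []
    n_collisions = 0
    for vx, vy, vz in velocities:
        if rng.random() < p_collide:
            speed = math.sqrt(vx * vx + vy * vy + vz * vz)
            # Sample a direction uniformly on the unit sphere.
            cos_theta = 2.0 * rng.random() - 1.0
            sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
            phi = 2.0 * math.pi * rng.random()
            updated.append((
                speed * sin_theta * math.cos(phi),
                speed * sin_theta * math.sin(phi),
                speed * cos_theta,
            ))
            n_collisions += 1
        else:
            updated.append((vx, vy, vz))
    return updated, n_collisions


if __name__ == "__main__":
    rng = random.Random(0)
    electrons = [(1.0, 0.0, 0.0)] * 1000
    new_v, n = collide_isotropic(electrons, nu=1.0, dt=0.5, rng=rng)
    print(f"{n} of {len(new_v)} electrons scattered")
```

Each particle is updated independently of the others, which is the property that makes this kind of kernel a natural fit for one-thread-per-particle GPU execution. Electron-electron and electron-ion Coulomb collisions, which the abstract also covers, couple particles to each other and are not represented by this sketch.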

Problem

Research questions and friction points this paper is trying to address.

Develop a portable multi-GPU solver for collisional plasmas

Accelerate algorithms for collisions of electrons with neutrals, ions, and other electrons

Evaluate Python-based HPC tools, in particular PyKokkos, for rapid prototyping

Innovation

Methods, ideas, or system contributions that make the work stand out.

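Parallel multi-GPU particle-in-cell (PIC) methods with kinetic electrons and fluid heavy species

Python-based HPC tools (PyKokkos) for rapid prototyping of GPU kernels

Distributed-memory implementation with scaling results on up to 16 MPI ranks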

🔎 Similar Papers

Accelerating Particle-in-Cell Monte Carlo simulations with MPI, OpenMP/OpenACC and asynchronous multi-GPU programming