🤖 AI Summary
In particle-in-cell (PIC) simulations of low-temperature plasmas, charge deposition (CD) suffers from severe parallel bottlenecks due to frequent particle–mesh interactions—especially in 2D/3D device-scale simulations, where conventional per-core private mesh strategies incur substantial memory redundancy and poor scalability. To address this, we propose a particle–thread binding mechanism: only four private meshes per node are required, achieved via fine-grained thread binding, flag-based synchronization, and a lightweight arbitration function that prevents concurrent particle updates to the same mesh cell. The method preserves standard PIC data structures and requires minimal code modifications. Experimental evaluation on large-scale distributed-memory (thousand-core) and shared-memory systems demonstrates strong scalability of the CD kernel—significantly outperforming traditional approaches—while maintaining low hardware dependency and implementation overhead.
📝 Abstract
The Particle-In-Cell (PIC) method for plasma simulation tracks particle phase-space information using particle and grid data structures. The high computational cost of 2D and 3D device-scale PIC simulations necessitates parallelization, with the Charge Deposition (CD) subroutine often becoming a bottleneck due to frequent particle–grid interactions. Conventional methods mitigate data dependencies by generating a private grid for each core, but this approach faces scalability issues. We propose a novel approach based on a particle–thread binding strategy that requires only four private grids per node on distributed-memory systems, or four in total on shared-memory systems, enhancing CD scalability and performance while preserving conventional data structures and requiring minimal changes to existing PIC codes. The method ensures complete accessibility of the grid data structure for concurrent threads and prevents simultaneous access to particles within the same cell using additional functions and flags. Performance evaluations using a PIC benchmark for low-temperature partially magnetized E×B discharge simulation, on a shared-memory system as well as a distributed-memory system (1,000 cores), demonstrate the method's scalability and show that it has little hardware dependency.
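The core idea, as described in the abstract, is a small fixed number of private grids shared by bound threads, with flag-based arbitration so no two threads update the same cell concurrently. The following is a minimal illustrative sketch (not the authors' implementation) in Python: the grid size, particle positions, linear weighting, and the use of per-cell locks as the "flags" are all assumptions made for the example.

```python
import threading

NX = 8          # hypothetical 1D grid size for illustration
N_PRIVATE = 4   # four private grids per node, as in the proposed method

# Four private grids instead of one per thread; threads are bound to them.
private_grids = [[0.0] * NX for _ in range(N_PRIVATE)]
# One lock per cell per private grid stands in for the paper's flags,
# arbitrating concurrent particle updates to the same cell.
cell_locks = [[threading.Lock() for _ in range(NX)] for _ in range(N_PRIVATE)]

def deposit(thread_id, particles):
    """Deposit linear-weighted charge into the thread's bound private grid."""
    g = thread_id % N_PRIVATE          # particle-thread binding to a grid
    grid, locks = private_grids[g], cell_locks[g]
    for x, q in particles:
        i = int(x)                     # left node index
        w = x - i                      # linear weight toward the right node
        with locks[i]:                 # only one updater per cell at a time
            grid[i] += q * (1.0 - w)
        with locks[(i + 1) % NX]:
            grid[(i + 1) % NX] += q * w

# Eight threads share the four private grids, so two threads per grid
# exercise the per-cell arbitration; each deposits one unit charge at x=1.5.
threads = [threading.Thread(target=deposit, args=(t, [(1.5, 1.0)]))
           for t in range(8)]
for th in threads:
    th.start()
for th in threads:
    th.join()

# Reduction step: sum the four private grids into the global charge grid.
global_grid = [sum(g[i] for g in private_grids) for i in range(NX)]
```

Because only four private copies exist regardless of thread count, the memory redundancy and the cost of the final reduction stay constant as cores are added, which is the scalability argument the abstract makes against the one-private-grid-per-core approach.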