AI Summary
Astrophysical, cosmological, and space plasma simulation codes face scalability bottlenecks on emerging accelerator-based exascale architectures.
Method: We conducted co-optimization of three core codes (gPLUTO, OpenGadget3, and iPIC3D) via interdisciplinary collaboration among domain scientists, software developers, and HPC experts, integrating fine-grained performance analysis and multi-level parallel tuning to achieve end-to-end refactoring on the CINECA Leonardo (EuroHPC) GPU cluster.
Contribution/Results: We present an optimization strategy for heterogeneous, accelerator-based architectures. In preliminary tests on Leonardo, all three codes scale efficiently, reaching about 80% scaling efficiency on up to 1,024 GPUs. This work contributes software infrastructure for large-scale, high-fidelity cosmic simulations.
Abstract
Developing and redesigning astrophysical, cosmological, and space plasma numerical codes for existing and next-generation accelerators is critical for enabling large-scale simulations. To meet this need, the SPACE Center of Excellence (SPACE-CoE) fosters collaboration among scientists, code developers, and high-performance computing experts to optimize applications for the exascale era. This paper presents our strategy and initial results on the Leonardo system at CINECA for three flagship codes, gPLUTO, OpenGadget3, and iPIC3D, using profiling tools to analyze performance on single and multiple nodes. Preliminary tests show that all three codes scale efficiently, reaching 80% scaling efficiency on up to 1,024 GPUs.
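
The 80% figure is most naturally read as a parallel scaling efficiency. The abstract does not specify whether it refers to strong scaling (fixed total problem size) or weak scaling (fixed problem size per GPU), so the minimal sketch below shows the standard definition of each. The function names are hypothetical, and the timings are made-up illustrative values, not measurements from the paper.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Efficiency for a fixed total problem size.

    t_ref: wall-clock time on the baseline run with n_ref GPUs
    t_n:   wall-clock time on n GPUs
    Ideal time on n GPUs is t_ref * n_ref / n, so efficiency is ideal / actual.
    """
    return (t_ref * n_ref) / (t_n * n)


def weak_scaling_efficiency(t_ref, t_n):
    """Efficiency for a fixed problem size per GPU.

    Ideally the wall-clock time stays constant as GPUs are added.
    """
    return t_ref / t_n


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not taken from the paper):
    # a baseline of 1000 s on 4 GPUs and 4.8828125 s on 1,024 GPUs.
    eff = strong_scaling_efficiency(t_ref=1000.0, n_ref=4, t_n=4.8828125, n=1024)
    print(f"strong scaling efficiency: {eff:.0%}")

    # Weak scaling: 100 s per step at baseline, 125 s at the largest run.
    print(f"weak scaling efficiency: {weak_scaling_efficiency(100.0, 125.0):.0%}")
```

Under either definition, an efficiency of 0.8 means the run takes 25% longer than ideal scaling from the baseline would predict.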