Radiation Hydrodynamics at Scale: Comparing MPI and Asynchronous Many-Task Runtimes with FleCSI

📅 2026-03-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large-scale radiation hydrodynamics simulations pose significant challenges for efficient distributed programming. Leveraging the FleCSI framework, this study presents the first systematic evaluation of weak- and strong-scaling performance for three parallel runtime systems (MPI, Legion, and HPX) at scales of up to 131,072 cores (1,024 nodes), using real scientific applications including a Poisson solver and the radiation hydrodynamics code HARD, with comparisons against MPI+Kokkos. Experimental results show that MPI achieves over 97% parallel efficiency in the communication-intensive scenario, while HPX outperforms MPI+Kokkos on fewer than 64 nodes, with speedups of up to 1.64x on the compute-intensive hydrodynamics-only benchmark, highlighting both the practical potential and the current limitations of asynchronous task-based runtimes in high-performance scientific computing.
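The communication-focused Poisson benchmark is, at its core, an iterative stencil kernel. A serial NumPy sketch of one such kernel (a plain Jacobi sweep on a 2D grid; the grid size, iteration count, and boundary values here are illustrative and not taken from the paper):

```python
import numpy as np

def jacobi_step(u, f, h):
    """One Jacobi sweep for -laplace(u) = f on a uniform 2D grid with
    spacing h; boundary values are held fixed (Dirichlet)."""
    new = u.copy()
    new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                              + u[1:-1, :-2] + u[1:-1, 2:]
                              + h * h * f[1:-1, 1:-1])
    return new

# Toy run: zero source term, all boundaries at 0 except the top edge at 1.
n = 32
u = np.zeros((n, n))
u[0, :] = 1.0                      # top boundary held at 1
f = np.zeros((n, n))
for _ in range(500):
    u = jacobi_step(u, f, 1.0 / (n - 1))
```

In the paper's distributed setting, each backend would partition the grid across ranks or tasks and exchange halo rows between sweeps; the per-sweep stencil itself stays the same, which is why this benchmark isolates communication behavior.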

📝 Abstract
Writing efficient distributed code remains a labor-intensive and complex endeavor. To simplify application development, the Flexible Computational Science Infrastructure (FleCSI) framework offers a user-oriented, high-level programming interface that is built upon a task-based runtime model. Internally, FleCSI integrates state-of-the-art parallelization backends, including MPI and the asynchronous many-task runtimes (AMTRs) Legion and HPX, enabling applications to fully leverage asynchronous parallelism. In this work, we benchmark two applications using FleCSI's three backends on up to 1,024 nodes, aiming to quantify the advantages and overheads introduced by the AMTR backends. As representative applications, we select a simple Poisson solver and the multidimensional radiation hydrodynamics code HARD. In the communication-focused Poisson solver benchmark, FleCSI achieves over 97% parallel efficiency using the MPI backend under weak scaling on up to 131,072 cores, indicating that only minimal overhead is introduced by its abstraction layer. While the Legion backend exhibits notable overheads and scaling limitations, the HPX backend introduces only marginal overhead compared to MPI+Kokkos. However, the scalability of the HPX backend is currently limited by its use of unoptimized HPX collective operations. In the computation-focused radiation hydrodynamics benchmarks, the performance gap between the MPI and HPX backends fades. On fewer than 64 nodes, the HPX backend outperforms MPI+Kokkos, achieving an average speedup of 1.31 under weak scaling and up to 1.27 under strong scaling. For the hydrodynamics-only HARD benchmark, the HPX backend demonstrates superior performance on fewer than 32 nodes, achieving speedups of up to 1.20 relative to MPI and up to 1.64 relative to MPI+Kokkos.
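The efficiency and speedup figures quoted in the abstract reduce to simple ratios of wall-clock times. A minimal sketch of the standard definitions (the timing values in the example are made up for illustration, not measurements from the paper):

```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Weak scaling: work per core is fixed, so the ideal runtime is constant.
    Efficiency = T(base) / T(scaled)."""
    return t_base / t_scaled

def strong_scaling_efficiency(t_base, t_scaled, cores_base, cores_scaled):
    """Strong scaling: total work is fixed, so ideal runtime shrinks linearly.
    Efficiency = measured speedup / core-count ratio."""
    speedup = t_base / t_scaled
    return speedup / (cores_scaled / cores_base)

def relative_speedup(t_reference, t_candidate):
    """Speedup of one backend over another (e.g. HPX vs. MPI+Kokkos);
    a value > 1 means the candidate backend is faster."""
    return t_reference / t_candidate

# Illustrative numbers only:
eff_weak = weak_scaling_efficiency(t_base=100.0, t_scaled=103.0)           # ~0.97, i.e. 97%
eff_strong = strong_scaling_efficiency(100.0, 30.0, cores_base=64,
                                       cores_scaled=256)                   # ~0.83
spd = relative_speedup(t_reference=82.0, t_candidate=50.0)                 # 1.64x
```

Note that under weak scaling the problem grows with the machine, so "97% efficiency on 131,072 cores" means the per-step runtime increased by only about 3% relative to the baseline run.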
Problem

Research questions and friction points this paper is trying to address.

Radiation Hydrodynamics
Distributed Computing
MPI
Asynchronous Many-Task Runtimes
Scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

FleCSI
asynchronous many-task runtime
radiation hydrodynamics
HPX
parallel scalability