Implementing True MPI Sessions and Evaluating MPI Initialization Scalability

Date: 2026-05-05

AI Summary
This work addresses the scalability bottleneck in MPI initialization caused by traditional MPI implementations' reliance on the global communicator MPI_COMM_WORLD, particularly at exascale. By rearchitecting MPICH's internal design, the authors present the first mainstream MPI implementation that fully decouples its internals from MPI_COMM_WORLD and provides a true Sessions model compliant with the MPI-4 standard. The proposed approach employs explicit hierarchical process-set management and a scalable initialization protocol, substantially improving startup scalability. Experimental results demonstrate that the new mechanism efficiently supports exascale-class supercomputing systems, establishing a critical foundation for deploying MPI on next-generation ultra-large-scale platforms.
Abstract
Sessions is one of the major features introduced in the MPI-4 standard. It offers an alternative to the traditional world communicator model by allowing applications to construct communicators from process sets, thereby eliminating the dependency on MPI_COMM_WORLD. The Sessions model was proposed as a more scalable solution for exascale systems, where MPI_COMM_WORLD was viewed as a potential scalability bottleneck. However, supporting Sessions is a significant challenge for established codebases like MPICH because the world model is deeply integrated into traditional MPI implementations. Although MPICH added support for the MPI-4 standard upon its release, it still relied internally on a global world communicator. This approach enabled applications written using the Sessions model to function, but it did not fulfill the full design intent of Sessions, which was to decouple MPI from MPI_COMM_WORLD. We describe the MPICH effort to support true MPI Sessions, including a major internal refactoring, detail the architectural changes required, and evaluate the scalability of the resulting implementation. Our results demonstrate that true Sessions can offer significant scalability benefits by adopting explicit hierarchical designs.
Problem

Research questions and friction points this paper is trying to address.

MPI Sessions
MPI_COMM_WORLD
scalability
exascale systems
MPI-4
Innovation

Methods, ideas, or system contributions that make the work stand out.

MPI Sessions
scalability
MPI_COMM_WORLD
hierarchical design
MPICH