VSS Challenge Problem: Verifying the Correctness of AllReduce Algorithms in the MPICH Implementation of MPI

📅 2025-10-15
🤖 AI Summary
Problem: Every AllReduce implementation in MPICH must be semantically equivalent to the canonical "reduce, then broadcast" specification, yet the algorithms are varied and highly concurrent, which makes rigorous correctness assurance difficult. Method: The authors extract three representative AllReduce algorithms from the MPICH source code and refactor them into standalone, analyzable programs; they then verify two of them (Bruck and Recursive Doubling) with CIVL, the Concurrency Intermediate Verification Language. Contribution: The work reports machine-checked verification of two core MPICH AllReduce algorithms, showing that formal methods can be applied to MPI collective implementations taken from production code, and it offers a starting point for verifying other parts of high-performance computing libraries.

📝 Abstract
We describe a challenge problem for verification based on the MPICH implementation of MPI. The MPICH implementation includes several algorithms for allreduce, all of which should be functionally equivalent to reduce followed by broadcast. We created standalone versions of three algorithms and verified two of them using CIVL.
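
The equivalence stated in the abstract can be illustrated with a small simulation. The sketch below is not MPICH code and does not use CIVL: it models each rank's buffer as a list entry in a single Python process, and the function names `reduce_then_broadcast` and `recursive_doubling_allreduce` are hypothetical names chosen for this illustration. It assumes an associative operator and a power-of-two number of ranks, and it does not attempt the non-power-of-two case.

```python
from functools import reduce
from operator import add


def reduce_then_broadcast(values, op):
    """Reference semantics: combine all contributions in rank order,
    then hand the same result to every rank."""
    total = reduce(op, values)
    return [total] * len(values)


def recursive_doubling_allreduce(values, op):
    """Simulate recursive doubling; values[r] is rank r's contribution.

    Assumption of this sketch: the number of ranks is a power of two.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch only handles a power-of-two number of ranks")
    bufs = list(values)
    mask = 1
    while mask < p:
        nxt = []
        for rank in range(p):
            partner = rank ^ mask  # exchange partner for this round
            # Apply op with the lower rank's buffer on the left so that
            # non-commutative (but associative) operators keep rank order.
            lo, hi = (rank, partner) if rank < partner else (partner, rank)
            nxt.append(op(bufs[lo], bufs[hi]))
        bufs = nxt
        mask <<= 1
    return bufs


if __name__ == "__main__":
    for p in (1, 2, 4, 8):
        ints = list(range(1, p + 1))
        strs = [chr(ord("a") + i) for i in range(p)]
        assert recursive_doubling_allreduce(ints, add) == reduce_then_broadcast(ints, add)
        assert recursive_doubling_allreduce(strs, add) == reduce_then_broadcast(strs, add)
    print("simulated recursive doubling matches reduce-then-broadcast")
```

Checking a handful of inputs like this is testing, not verification: it says nothing about other process counts, other operators, or message interleavings in a real MPI run, which is the gap a tool such as CIVL is meant to address.
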
Problem

Research questions and friction points this paper is trying to address.

Verifying correctness of AllReduce algorithms in MPICH
Ensuring functional equivalence to reduce-broadcast operations
Creating standalone versions for formal verification with CIVL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Verifying MPICH AllReduce algorithms with CIVL
Extracting standalone versions of three algorithms from the MPICH source
Ensuring functional equivalence to reduce-broadcast