🤖 AI Summary
Existing benchmarks struggle to evaluate AI-accelerator performance while also ensuring numerical reliability. To address this, we propose HPL-MxP, the first mixed-precision linear algebra benchmark designed for exascale supercomputing. Our method integrates low-precision LU factorization with non-stationary GMRES-based iterative refinement, eliminating fixed-iteration constraints; introduces diagonal scaling to improve numerical stability on ill-conditioned problems; and provides a scalable matrix generator supporting terabyte-scale inputs. Experimental results show that HPL-MxP achieves exascale-level throughput on mainstream AI accelerators while keeping the backward error under control. HPL-MxP thus establishes a new evaluation standard for HPC–AI convergence, offering scalability, robustness, and hardware adaptability.
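The role of the diagonal (equilibration) scaling mentioned above can be illustrated with a small NumPy sketch. This is not the benchmark's actual scaling scheme or matrix generator; it is a generic two-sided equilibration on a synthetic ill-scaled system, with FP16 storage emulated by round-tripping through `numpy.float16` to mimic an accelerator's low-precision dynamic range. All names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Ill-scaled test system: a well-conditioned, diagonally dominant core M
# wrapped in widely varying row/column scalings d (illustrative only).
d = 10.0 ** rng.uniform(-2.0, 2.0, n)
M = rng.standard_normal((n, n)) + n * np.eye(n)
A = d[:, None] * M * d[None, :]
x_true = np.ones(n)
b = A @ x_true

def fp16_round(X):
    """Emulate storing a matrix in FP16: entries beyond ~6.5e4 overflow to inf."""
    return X.astype(np.float16).astype(np.float64)

# Without scaling, the largest entries of A exceed the FP16 range,
# so the "low precision" solve breaks down.
try:
    x_naive = np.linalg.solve(fp16_round(A), b)
except np.linalg.LinAlgError:
    x_naive = np.full(n, np.nan)

# Two-sided diagonal scaling maps every entry of the matrix into [-1, 1]:
# divide each row by its largest magnitude, then each column likewise.
R = 1.0 / np.abs(A).max(axis=1)                 # row scaling
C = 1.0 / np.abs(R[:, None] * A).max(axis=0)    # column scaling
As = R[:, None] * A * C[None, :]                # entries now in [-1, 1]

# Solve the scaled system (R A C) y = R b in emulated FP16 storage,
# then undo the column scaling: x = C * y.
y = np.linalg.solve(fp16_round(As), R * b)
x_scaled = C * y

err_naive = np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true)
err_scaled = np.linalg.norm(x_scaled - x_true) / np.linalg.norm(x_true)
print(f"unscaled error: {err_naive}  scaled error: {err_scaled}")
```

The unscaled matrix overflows FP16 and yields a useless solution, while the equilibrated system stays representable and solves accurately, which is the kind of robustness gain the summary attributes to diagonal scaling on ill-conditioned inputs.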
📝 Abstract
We present a mixed-precision benchmark, HPL-MxP, that combines a lower-precision LU factorization with non-stationary iterative refinement based on GMRES. We evaluate the numerical stability of one method of generating the input matrix in a scalable fashion and show how diagonal scaling affects the solution quality in terms of the backward error. Performance results at large-scale supercomputing installations reached exascale-level compute throughput, demonstrating the viability of the proposed benchmark for evaluating such machines. We also discuss the benchmark's potential for growing adoption as hardware accelerators for AI workloads proliferate, since their reliable evaluation remains a particular challenge for users.
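The core idea of the abstract, factorizing once in low precision and recovering double-precision quality through iterative refinement while monitoring the backward error, can be sketched as follows. This is a generic mixed-precision refinement loop, not the HPL-MxP implementation: it uses `float32` as a stand-in for the FP16/BF16 arithmetic of real accelerator runs, and a simple stationary correction where HPL-MxP instead runs GMRES preconditioned by the low-precision LU factors, which is what removes the fixed-iteration constraint.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 500
# Diagonally dominant test matrix (a stand-in for the benchmark's generated input).
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

def backward_error(A, x, b):
    """Normwise backward error, the quantity the benchmark checks."""
    r = b - A @ x
    return np.linalg.norm(r, np.inf) / (
        np.linalg.norm(A, np.inf) * np.linalg.norm(x, np.inf)
        + np.linalg.norm(b, np.inf))

# Factor once in low precision (float32 here; accelerators would use FP16/BF16).
lu, piv = lu_factor(A.astype(np.float32))

# Initial low-precision solve, promoted to FP64 for refinement.
x = lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)

for _ in range(10):
    if backward_error(A, x, b) < 1e-14:
        break
    r = b - A @ x                       # residual computed in FP64
    # Cheap correction via the reused low-precision factors.  HPL-MxP
    # replaces this stationary step with LU-preconditioned GMRES.
    x += lu_solve((lu, piv), r.astype(np.float32)).astype(np.float64)

bwd = backward_error(A, x, b)
print(f"backward error after refinement: {bwd}")
```

On a well-conditioned matrix like this one, a couple of refinement sweeps drive the backward error down to double-precision roundoff even though the factorization was done entirely in lower precision, which is the effect the benchmark exploits to pair accelerator throughput with a controlled backward error.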