🤖 AI Summary
This study addresses the trade-offs among scalability, energy efficiency, and memory performance for multi-disciplinary scientific workloads on MareNostrum5, a European pre-exascale supercomputer featuring Intel Sapphire Rapids CPUs, NVIDIA Hopper GPUs, and heterogeneous DDR5/HBM memory. We propose three innovations: (1) the first co-optimization framework integrating the EAR energy-aware runtime with direct liquid cooling; (2) an empirical methodology for analyzing performance–power trade-offs across HBM and DDR5 memory configurations; and (3) an application-aware, full-stack performance evaluation framework. Reported results include a system peak performance of 314 petaflops, strong scaling validated up to 10,000 cores, a Power Usage Effectiveness (PUE) as low as 1.08, HBM bandwidth 3.2× that of DDR5, and a 40% improvement in overall energy efficiency over the prior generation. These findings provide both methodological guidance and empirical evidence for efficient deployment and workload optimization on heterogeneous exascale-class systems.
📝 Abstract
MareNostrum5 is a pre-exascale supercomputer at the Barcelona Supercomputing Center (BSC), part of the EuroHPC Joint Undertaking. With a peak performance of 314 petaflops, MareNostrum5 features a hybrid architecture comprising Intel Sapphire Rapids CPUs, NVIDIA Hopper GPUs, and both DDR5 and high-bandwidth memory (HBM), organized into four partitions optimized for diverse workloads. This document evaluates MareNostrum5 through micro-benchmarks (floating-point performance, memory bandwidth, interconnect throughput), HPC benchmarks (HPL and HPCG), and application studies using Alya, OpenFOAM, and IFS. It highlights MareNostrum5's scalability, efficiency, and energy performance, using the EAR (Energy Aware Runtime) framework to assess power consumption and the effects of direct liquid cooling. It also compares HBM and DDR5 configurations to examine memory performance trade-offs. Designed to complement standard technical documentation, this study provides insights to guide both new and experienced users in optimizing their workloads and maximizing MareNostrum5's computational capabilities.
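Two of the metrics quoted above, Power Usage Effectiveness (PUE) and strong-scaling behavior, have standard definitions that can be written down in a few lines. The following is a minimal sketch of those definitions only, not of the paper's measurement procedure: the function names and all input numbers are hypothetical and chosen purely for illustration.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def strong_scaling_efficiency(t_base: float, n_base: int, t_n: float, n: int) -> float:
    """Parallel efficiency for a fixed problem size, relative to a baseline run.

    t_base, n_base: runtime and core count of the baseline run
    t_n, n:         runtime and core count of the larger run
    """
    speedup = t_base / t_n   # how much faster the larger run is
    ideal = n / n_base       # speedup expected under perfect scaling
    return speedup / ideal


if __name__ == "__main__":
    # Hypothetical power figures (kW), not measurements from MareNostrum5.
    print(f"PUE = {pue(1080.0, 1000.0):.2f}")

    # Hypothetical runtimes (s) for the same problem on 1,000 and 10,000 cores.
    eff = strong_scaling_efficiency(t_base=100.0, n_base=1000, t_n=12.5, n=10000)
    print(f"strong-scaling efficiency = {eff:.2f}")
```

Under these definitions, a PUE of 1.08 means that cooling and other facility overheads add 8% on top of the power drawn by the IT equipment itself, and a strong-scaling efficiency near 1 means that runtime falls almost in proportion to the number of cores added.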