Multi-Partner Project: Multi-GPU Performance Portability Analysis for CFD Simulations at Scale

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses performance portability in computational fluid dynamics (CFD) on heterogeneous supercomputing architectures. Focusing on SOD2D, a spectral element CFD framework studied in the context of the REFMAP project, it presents a systematic evaluation of cross-vendor performance portability between AMD and NVIDIA GPU platforms for urban airflow prediction. A full-stack analysis spanning the application, software, and hardware layers is conducted, using vendor-specific compiler stacks for single-GPU characterization and the LUMI multi-GPU cluster for scalability experiments. Results reveal significant performance disparities: single-GPU memory access optimizations yield speedups ranging from 0.69× to 3.91× (values below 1× are slowdowns), and multi-GPU throughput at scale varies similarly. These findings highlight the limits of performance projections and the need for multi-level, informed tuning.

📝 Abstract
As heterogeneous supercomputing architectures leveraging GPUs become increasingly central to high-performance computing (HPC), it is crucial for computational fluid dynamics (CFD) simulations, a de facto HPC workload, to utilize such hardware efficiently. One of the key challenges for HPC codes is performance portability, i.e., the ability to maintain near-optimal performance across different accelerators. In the context of the REFMAP project, which targets scalable, GPU-enabled multi-fidelity CFD for urban airflow prediction, this paper analyzes the performance portability of SOD2D, a state-of-the-art spectral element simulation framework, across AMD and NVIDIA GPU architectures. We first discuss the physical and numerical models underlying SOD2D, highlighting its computational hotspots. We then examine its performance and scalability at multiple levels by defining and characterizing an extensive full-stack design space that spans application, software, and hardware infrastructure parameters. Single-GPU performance characterization across server-grade NVIDIA and AMD GPU architectures and vendor-specific compiler stacks shows both the potential and the uneven effect of memory access optimizations, with speedups ranging from 0.69× to 3.91×. Performance variability of SOD2D at scale is further examined on the LUMI multi-GPU cluster, where profiling reveals similar throughput variations, highlighting the limits of performance projections and the need for multi-level, informed tuning.
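To make the reported speedup range easier to read, the sketch below assumes the usual definition of speedup as baseline runtime divided by optimized runtime, so a ratio below 1× means the "optimization" slowed the kernel down. The function names and the timing values are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Return baseline runtime divided by optimized runtime.

    Assumed definition: ratios above 1.0 are accelerations,
    ratios below 1.0 are slowdowns.
    """
    if baseline_seconds <= 0 or optimized_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / optimized_seconds


def classify(ratio: float) -> str:
    """Label a speedup ratio as a slowdown, no change, or an acceleration."""
    if ratio < 1.0:
        return "slowdown"
    if ratio > 1.0:
        return "acceleration"
    return "no change"


if __name__ == "__main__":
    # Hypothetical timings (seconds) for the same kernel on two platforms.
    hypothetical_runs = {
        "platform_a": (2.0, 4.0),  # optimized variant runs slower
        "platform_b": (2.0, 0.5),  # optimized variant runs faster
    }
    for name, (baseline, optimized) in hypothetical_runs.items():
        ratio = speedup(baseline, optimized)
        print(f"{name}: {ratio:.2f}x ({classify(ratio)})")
```

Read this way, the same memory access optimization can fall on either side of 1× depending on the GPU and compiler stack, which is the portability concern the abstract describes.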
Problem

Research questions and friction points this paper is trying to address.

performance portability
computational fluid dynamics
GPU architectures
heterogeneous computing
CFD simulations
Innovation

Methods, ideas, or system contributions that make the work stand out.

performance portability
multi-GPU
CFD simulation
spectral elements
heterogeneous HPC
Panagiotis-Eleftherios Eleftherakis
National Technical University of Athens, Greece
George Anagnostopoulos
National Technical University of Athens, Greece
Anastassis Kapetanakis
National Technical University of Athens, Greece
Mohammad Umair
KTH Royal Institute of Technology, Sweden
Jean-Yves Vet
Hewlett Packard Enterprise (HPE), France
Konstantinos Iliakis
National Technical University of Athens, Greece
Jonathan Vincent
Graduate student, Universite de Sherbrooke
Artificial Intelligence, Deep learning
Jing Gong
KTH Royal Institute of Technology, Sweden
Akshay Patil
Facebook
Data Mining, Social Networks, Algorithms
Clara García-Sánchez
Technical University of Delft, Netherlands
Gerardo Zampino
KTH Royal Institute of Technology, Sweden
Ricardo Vinuesa
Associate Professor, University of Michigan
Artificial Intelligence, Simulation, Turbulent boundary layers, Flow control, Sustainability
Sotirios Xydis
Assistant Professor, School of ECE, National Technical University of Athens
Computer Hardware and Architecture, Energy-Aware Computing, High Level Synthesis, Design Space Exploration, Arithmetic Circuits