Building an Accelerated OpenFOAM Proof-of-Concept Application using Modern C++

📅 2025-07-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
OpenFOAM lacks standardized heterogeneous programming support for high-performance computing (HPC), hindering seamless GPU acceleration. Method: This paper proposes a GPU-accelerated implementation of the laplacianFoam solver using modern ISO C++17/20 parallel algorithms (e.g., `std::transform_reduce`, `std::for_each`) and the standard parallel execution policy `std::execution::par_unseq`. Leveraging only the NVIDIA HPC SDK compiler—without CUDA, HIP, or other vendor-specific APIs—it enables automatic offloading of the core Laplacian operator to NVIDIA GPUs. Contribution/Results: This work presents the first demonstration of pure standard C++ heterogeneous programming in OpenFOAM, achieving a unified CPU/GPU codebase and multi-backend portability. Experimental evaluation shows 3.2–4.8× speedup for critical kernels, significantly improving numerical throughput. The approach establishes a novel paradigm for evolving CFD frameworks toward standardized, portable heterogeneous computing.

Technology Category

Application Category

📝 Abstract
The modern trend in High-Performance Computing (HPC) involves the use of accelerators such as Graphics Processing Units (GPUs) alongside Central Processing Units (CPUs) to speed up numerical operations in various applications. Leading manufacturers such as NVIDIA, Intel, and AMD are constantly advancing these architectures, augmenting them with features such as mixed precision, enhanced memory hierarchies, and specialised accelerator silicon blocks (e.g., Tensor Cores on GPU or AMX/SME engines on CPU) to enhance compute performance. At the same time, significant efforts in software development are aimed at optimizing the use of these innovations, seeking to improve usability and accessibility. This work contributes to the state-of-the-art of OpenFOAM development by presenting a working Proof-Of-Concept application built using modern ISO C++ parallel constructs. This approach, combined with an appropriate compiler runtime stack, like the one provided by the NVIDIA HPC SDK, makes it possible to accelerate well-defined kernels, allowing multi-core execution and GPU offloading using a single codebase. The study demonstrates that it is possible to increase the performance of the OpenFOAM laplacianFoam application by offloading the computations on NVIDIA GPUs using the C++ parallel construct.
Problem

Research questions and friction points this paper is trying to address.

Accelerating OpenFOAM using modern C++ for HPC
Enabling GPU offloading in OpenFOAM via C++ parallel constructs
Improving laplacianFoam performance with NVIDIA GPU computation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses modern ISO C++ parallel constructs
Combines with NVIDIA HPC SDK runtime
Enables GPU offloading with single codebase
🔎 Similar Papers
No similar papers found.
G
Giulio Malenza
Department of Computer Science, University of Torino, Corso Svizzera 185, 10149 Torino, Italy
G
Giovanni Stabile
Biorobotics Institute, Sant’Anna School of Advanced Studies, Viale Rinaldo Piaggio 34, 56025 Pisa, Italy
Filippo Spiga
Filippo Spiga
NVIDIA Ltd
High Performance Computing
Robert Birke
Robert Birke
Università degli Studi Di Torino
Marco Aldinucci
Marco Aldinucci
Full Professor in Computer Science, University of Torino
Parallel programming modelsparallel programmingRuntime SystemsHPCcloud engineering