A GPU-based Compressible Combustion Solver for Applications Exhibiting Disparate Space and Time Scales

📅 2025-10-27
🤖 AI Summary
High-speed reactive flows are computationally expensive because of multiscale coupling: stiff chemical kinetics impose restrictive time steps and often dominate run time. Existing GPU-based combustion solvers also suffer from inefficient memory access, load imbalance, and poor handling of spatially localized reactions. This work develops a high-performance compressible combustion solver for multi-GPU architectures within the AMReX framework. It uses column-major storage to improve global memory access, a bulk-sparse integration strategy for chemical kinetics that adapts existing matrix-based formulations to multigrid contexts, and load balancing across GPUs for adaptive mesh refinement. The implementation achieves near-ideal weak scaling on 1–96 NVIDIA H100 GPUs and a 2–5× speedup over the initial GPU implementations. Roofline analysis shows that the arithmetic intensity of the convection and chemistry kernels improves by approximately 10× and 4×, respectively.

📝 Abstract
High-speed chemically active flows present significant computational challenges due to their disparate space and time scales, where stiff chemistry often dominates simulation time. While modern supercomputing scientific codes achieve exascale performance by leveraging graphics processing units (GPUs), existing GPU-based compressible combustion solvers face critical limitations in memory management, load balancing, and handling the highly localized nature of chemical reactions. To this end, we present a high-performance compressible reacting flow solver built on the AMReX framework and optimized for multi-GPU settings. Our approach addresses three GPU performance bottlenecks: memory access patterns through column-major storage optimization, computational workload variability via a bulk-sparse integration strategy for chemical kinetics, and multi-GPU load distribution for adaptive mesh refinement applications. The solver adapts existing matrix-based chemical kinetics formulations to multigrid contexts. Using representative combustion applications including hydrogen-air detonations and jet in supersonic crossflow configurations, we demonstrate $2$–$5\times$ performance improvements over initial GPU implementations with near-ideal weak scaling across $1$–$96$ NVIDIA H100 GPUs. Roofline analysis reveals substantial improvements in arithmetic intensity for both convection ($\sim 10\times$) and chemistry ($\sim 4\times$) routines, confirming efficient utilization of GPU memory bandwidth and computational resources.
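
The Roofline argument in the abstract can be illustrated with a minimal sketch. It assumes the standard Roofline bound, where attainable throughput is the smaller of the compute peak and arithmetic intensity times memory bandwidth. The function names and the hardware numbers below are hypothetical and chosen only for round arithmetic; they are not H100 specifications and are not taken from the paper.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound in GFLOP/s.

    arithmetic_intensity: FLOPs per byte moved to or from memory.
    peak_gflops:          compute roof of the device (hypothetical here).
    bandwidth_gbs:        memory bandwidth roof in GB/s (hypothetical here).
    """
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


# Illustrative device only: 1000 GFLOP/s peak, 100 GB/s bandwidth.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

# A memory-bound kernel whose arithmetic intensity rises 10x gains up to 10x
# in attainable throughput, as long as it stays left of the ridge point.
before = attainable_gflops(0.5, PEAK_GFLOPS, BANDWIDTH_GBS)
after = attainable_gflops(5.0, PEAK_GFLOPS, BANDWIDTH_GBS)
print(before, after, ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS))
```

Under this model, a higher arithmetic intensity helps only while the kernel is memory-bound. Past the ridge point the compute roof caps throughput, which is why the intensity gains reported above need not equal the end-to-end speedup.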
Problem

Research questions and friction points this paper is trying to address.

Optimizing GPU memory access for compressible combustion simulations
Balancing computational workload in stiff chemical kinetics
Improving multi-GPU load distribution for adaptive meshes
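
The load-distribution problem in the last item can be sketched as follows. This is a generic greedy heuristic (heaviest grid patch first, always to the least-loaded GPU), not the paper's algorithm. The per-box cost values and the names `assign_boxes` and `gpu_loads` are assumptions made for illustration.

```python
import heapq


def assign_boxes(costs, num_gpus):
    """Assign each box to a GPU with a greedy heaviest-first heuristic.

    costs:    estimated work per box (hypothetical units, e.g. cell count
              weighted by chemistry cost).
    num_gpus: number of devices to distribute over.
    Returns a list where owner[b] is the GPU index that owns box b.
    """
    # Min-heap of (accumulated load, gpu index); the lightest GPU pops first.
    heap = [(0.0, gpu) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    owner = [None] * len(costs)
    # Visit boxes from most to least expensive.
    for box in sorted(range(len(costs)), key=lambda b: costs[b], reverse=True):
        load, gpu = heapq.heappop(heap)
        owner[box] = gpu
        heapq.heappush(heap, (load + costs[box], gpu))
    return owner


def gpu_loads(costs, owner, num_gpus):
    """Total estimated work per GPU for a given assignment."""
    totals = [0.0] * num_gpus
    for cost, gpu in zip(costs, owner):
        totals[gpu] += cost
    return totals


costs = [8.0, 7.0, 6.0, 5.0, 4.0]
owner = assign_boxes(costs, num_gpus=2)
print(owner, gpu_loads(costs, owner, 2))
```

With adaptive mesh refinement the box list and costs change whenever the mesh is regridded, so an assignment like this would have to be recomputed after each regrid.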
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-optimized column-major memory access patterns
Bulk-sparse integration strategy for chemical kinetics
Multi-GPU load balancing with adaptive mesh refinement
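
The first two items can be pictured together in a short sketch. It assumes "bulk-sparse" means gathering only the reacting cells into a dense batch before integrating the chemistry, and that the batch is stored species-major so that consecutive cells of one species are adjacent in memory. Both readings are assumptions; the temperature-threshold criterion and all names are hypothetical and not taken from the paper.

```python
def gather_active(temperature, species, threshold):
    """Pack reacting cells into a flat species-major batch.

    temperature: per-cell temperature, used as a hypothetical activity test.
    species:     species[s][c] is the mass fraction of species s in cell c.
    threshold:   cells with temperature >= threshold count as reacting.
    Returns (active, batch) with batch[s * n_active + a] holding species s
    of the a-th active cell, so neighbouring cells are contiguous in memory.
    """
    active = [c for c, temp in enumerate(temperature) if temp >= threshold]
    n_active = len(active)
    batch = [0.0] * (len(species) * n_active)
    for s in range(len(species)):
        for a, cell in enumerate(active):
            batch[s * n_active + a] = species[s][cell]
    return active, batch


def scatter_back(active, batch, species):
    """Write the integrated batch back into the full per-cell arrays."""
    n_active = len(active)
    for s in range(len(species)):
        for a, cell in enumerate(active):
            species[s][cell] = batch[s * n_active + a]


temperature = [300.0, 1500.0, 2000.0, 300.0]
species = [[0.1, 0.2, 0.3, 0.4],
           [0.9, 0.8, 0.7, 0.6]]
active, batch = gather_active(temperature, species, threshold=1000.0)
print(active, batch)
```

On a GPU, the inner loop over `a` would map to consecutive threads. With this layout those threads read consecutive addresses, which is the access pattern that coalesces well in global memory. Cold cells never enter the batch, so the stiff integrator runs only where reactions occur.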