Evaluating the impact of the L3 cache size of AMD EPYC CPUs on the performance of CFD applications

📅 2025-05-23
🤖 AI Summary
This study investigates the impact of L3 cache capacity (256–1152 MB) across multiple generations of AMD EPYC processors (Rome to Bergamo) on memory-bound CFD applications. Method: Using two representative OpenFOAM cases—motorBike and Urban Air Pollution—we introduce FVOPS (Finite Volumes solved Per Second) for cross-architecture performance normalization and perform in-depth cache and memory-channel-level analysis via AMD uProf. Contribution/Results: We first identify a localized performance peak arising from the matching between L3 cache size and computational mesh granularity. We establish a quantitative analytical framework capturing synergistic effects of L2/L3 cache capacities, bandwidths, and memory channel configurations. Results reveal nonlinear L3 cache benefits and enable identification of optimal cache-to-core configurations per CPU generation. Experiments show Genoa X achieves up to 18% higher FVOPS than Genoa on large meshes, empirically validating the efficacy of cache-aware optimization strategies.

📝 Abstract
This work assesses the impact of the AMD EPYC processor architecture on the performance of CFD applications. Several architecture generations were analyzed (Rome, Milan, Milan X, Genoa, Genoa X and Bergamo), which differ in core count (64-128), L3 cache size (256-1152 MB) and RAM type (8-channel DDR4 or 12-channel DDR5). The study used the OpenFOAM application with two memory-bound models: motorBike and Urban Air Pollution. To compare application performance directly across architectures, the FVOPS (Finite VOlumes solved Per Second) metric was introduced. A local performance maximum was observed at particular grid sizes assigned to each process, and its position is related to individual processor attributes. Additionally, the behavior of the models was analyzed in detail with the AMD uProf profiling tool to reveal how the applications interact with the hardware. This enabled fine-grained monitoring of CPU behaviour and identified potential inefficiencies in AMD EPYC CPUs. Particular attention was paid to the effective use of L2 and L3 cache memory in the context of their capacity and the bandwidth of memory channels, which are key factors in memory-bound applications. Processor features were analyzed from a cross-platform perspective, which made it possible to determine the metrics with the greatest impact on the performance of CFD applications.
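The abstract names the FVOPS metric but does not give its formula. The following is a minimal sketch, assuming FVOPS is the number of finite volumes (mesh cells) multiplied by the number of solver time steps and divided by wall-clock time. The function names, the per-process helper, and all numbers are hypothetical illustrations, not values or definitions taken from the paper.

```python
def fvops(n_cells: int, n_steps: int, wall_time_s: float) -> float:
    """Finite volumes solved per second (assumed definition).

    n_cells     -- number of finite volumes (cells) in the mesh
    n_steps     -- number of solver time steps executed
    wall_time_s -- wall-clock time for those steps, in seconds
    """
    if wall_time_s <= 0:
        raise ValueError("wall_time_s must be positive")
    return n_cells * n_steps / wall_time_s


def cells_per_process(n_cells: int, n_procs: int) -> float:
    """Average grid size assigned to one process (hypothetical helper)."""
    if n_procs <= 0:
        raise ValueError("n_procs must be positive")
    return n_cells / n_procs


def l3_per_core_mb(l3_total_mb: float, n_cores: int) -> float:
    """L3 capacity available per core, assuming an even split."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return l3_total_mb / n_cores


if __name__ == "__main__":
    # Hypothetical run: 2 million cells, 100 time steps, 50 s wall time.
    print(fvops(2_000_000, 100, 50.0))
    # Hypothetical decomposition of the same mesh over 64 processes.
    print(cells_per_process(2_000_000, 64))
    # Hypothetical CPU with 256 MB of L3 shared evenly by 64 cores.
    print(l3_per_core_mb(256, 64))
```

Normalizing by cell count and time steps makes runs with different mesh sizes comparable, and pairing the per-process grid size with the per-core L3 share is one simple way to look for the grid sizes at which the abstract reports a local performance maximum.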
Problem

Research questions and friction points this paper is trying to address.

Assessing AMD EPYC L3 cache impact on CFD performance
Comparing CPU architectures using FVOPS metric
Analyzing cache usage efficiency in memory-bound applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzed AMD EPYC CPUs with varying L3 cache sizes
Introduced FVOPS metric for cross-architecture performance comparison
Used AMD uProf for detailed CPU behavior profiling
Marcin Lawenda
Poznan Supercomputing and Networking Center, Jana Pawła II 10, 61-139 Poznań, Poland
Lukasz Szustak
PhD, Assistant Professor, Czestochowa University of Technology
László Környei
Széchenyi István Egyetem-University of Győr, Győr Egyetem tér 1. tanulmányi ép. B-604, Hungary
F. C. C. Galeazzo
High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, Nobelstraße 19, 70569 Stuttgart, Germany
Pawel Bratek
Czestochowa University of Technology, Dąbrowskiego 69, 42-201 Częstochowa, Poland