Annotation-guided AoS-to-SoA conversions and GPU offloading with data views in C++

📅 2025-02-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Choosing by hand between AoS (array-of-structs) and SoA (structure-of-arrays) memory layouts in C++ forces a trade-off between performance and maintainability. This paper proposes a C++ language extension based on attributes that lets developers guide the compiler's choice of layout, which can differ by execution context and algorithm step. Implemented in Clang, the extension makes the compiler responsible for converting data into the preferred format before a calculation, writing results back afterward, and offloading to GPUs, so that data structures stay decoupled from computational logic. The approach is evaluated with a smoothed particle hydrodynamics (SPH) code on an Intel CPU, an ARM CPU, and an NVIDIA Grace-Hopper GPU, where it provides performance improvements, although the authors stress that the annotations do not eliminate the need for performance engineering. Its novelty lies in annotation-guided, compiler-managed memory layout conversions combined with data views for heterogeneous platforms.

📝 Abstract

The C++ programming language provides classes and structs as fundamental modeling entities. Consequently, C++ code tends to favour array-of-structs (AoS) for encoding data sequences, even though structure-of-arrays (SoA) yields better performance for some calculations. We propose a C++ language extension based on attributes that allows developers to guide the compiler in selecting memory arrangements, i.e. to choose between AoS and SoA dynamically depending on both the execution context and the algorithm step. The compiler can then automatically convert data into the preferred format prior to the calculations and convert results back afterward. The compiler handles all the complexity of determining which data to convert and how to manage data transformations. Our implementation realises the compiler extension for the new annotations in Clang and demonstrates their effectiveness through a smoothed particle hydrodynamics (SPH) code, which we evaluate on an Intel CPU, an ARM CPU, and a Grace-Hopper GPU. While the separation of concerns between data structure and operators is elegant and provides performance improvements, the new annotations do not eliminate the need for performance engineering. Instead, they challenge conventional performance wisdom and necessitate rethinking how to write efficient implementations.
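
The convert, compute, write-back cycle described in the abstract can be illustrated by writing it out by hand. The following is a minimal sketch in plain C++ and does not use the paper's attribute syntax, which this page does not show. The `Particle` type, its fields, and the functions `to_soa`, `drift`, `write_back`, and `drift_via_soa` are all hypothetical names chosen for illustration. The sketch assumes that a kernel touches only positions and velocities, so only those fields are gathered into SoA and only the modified positions are copied back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle type; the field names are illustrative only.
struct Particle {
    double x[3];     // position
    double v[3];     // velocity
    double mass;     // not used by the kernel below
    double density;  // not used by the kernel below
};

// Temporary SoA copy holding only the fields the kernel reads or writes.
struct PositionVelocitySoA {
    std::vector<double> x[3];
    std::vector<double> v[3];
};

// Step 1: gather the required fields from the AoS container into SoA.
PositionVelocitySoA to_soa(const std::vector<Particle>& aos) {
    PositionVelocitySoA soa;
    for (int d = 0; d < 3; ++d) {
        soa.x[d].resize(aos.size());
        soa.v[d].resize(aos.size());
    }
    for (std::size_t i = 0; i < aos.size(); ++i) {
        for (int d = 0; d < 3; ++d) {
            soa.x[d][i] = aos[i].x[d];
            soa.v[d][i] = aos[i].v[d];
        }
    }
    return soa;
}

// Step 2: run the kernel on the SoA copy; every inner loop is unit-stride.
void drift(PositionVelocitySoA& soa, double dt) {
    for (int d = 0; d < 3; ++d) {
        const std::size_t n = soa.x[d].size();
        for (std::size_t i = 0; i < n; ++i) {
            soa.x[d][i] += dt * soa.v[d][i];
        }
    }
}

// Step 3: scatter only the modified field (positions) back into the AoS data.
void write_back(const PositionVelocitySoA& soa, std::vector<Particle>& aos) {
    assert(soa.x[0].size() == aos.size());
    for (std::size_t i = 0; i < aos.size(); ++i) {
        for (int d = 0; d < 3; ++d) {
            aos[i].x[d] = soa.x[d][i];
        }
    }
}

// The whole cycle that, per the abstract, the compiler would generate.
void drift_via_soa(std::vector<Particle>& particles, double dt) {
    PositionVelocitySoA soa = to_soa(particles);
    drift(soa, dt);
    write_back(soa, particles);
}
```

Calling `drift_via_soa(particles, dt)` leaves the caller's `std::vector<Particle>` as the only persistent data structure; the SoA copy exists just for the duration of the kernel. Whether the two extra copies pay off depends on how much work the kernel does per element, which is in line with the abstract's remark that the annotations do not remove the need for performance engineering.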

Problem

Research questions and friction points this paper is trying to address.

Optimizing memory arrangement in C++

Dynamic AoS-to-SoA conversion

Compiler-guided GPU offloading

Innovation

Methods, ideas, or system contributions that make the work stand out.

C++ annotation-guided memory arrangement

Dynamic AoS-to-SoA conversion optimization

GPU offloading with data views

🔎 Similar Papers

Automatic BLAS Offloading on Unified Memory Architecture: A Study on NVIDIA Grace-Hopper