AI Summary
BVH performance is highly sensitive to data layout, yet existing systems tightly couple layout design with traversal logic. This coupling confines layout optimization to algorithm-specific implementations and makes it hard to achieve performance and portability at the same time. This paper introduces Scion, a domain-specific language and compiler that decouples BVH layout specification from traversal algorithms, enabling architecture-agnostic layout declarations and automatic optimization. Its core contribution is the first full abstraction of BVH layout coupled with cross-platform joint optimization, uncovering a novel layout that achieves Pareto optimality across diverse ray-tracing workloads. Experiments show that the Pareto-optimal layout, measured along performance and memory footprint, varies with traversal algorithm, hardware architecture, and workload characteristics. Compared to conventional hand-tuned layouts, Scion-generated layouts consistently deliver Pareto-superior trade-offs between performance and memory footprint across mainstream CPUs and GPUs.
Abstract
Bounding volume hierarchies are ubiquitous acceleration structures in graphics, scientific computing, and data analytics. Their performance depends critically on data layout choices that affect cache utilization, memory bandwidth, and vectorization -- increasingly dominant factors in modern computing. Yet, in most programming systems, these layout choices are hopelessly entangled with the traversal logic. This entanglement prevents developers from independently optimizing data layouts and algorithms across different contexts, perpetuating a false dichotomy between performance and portability. We introduce Scion, a domain-specific language and compiler for specifying the data layouts of bounding volume hierarchies independent of tree traversal algorithms. We show that Scion can express a broad spectrum of layout optimizations used in high performance computing while remaining architecture-agnostic. We demonstrate empirically that Pareto-optimal layouts (along performance and memory footprint axes) vary across algorithms, architectures, and workload characteristics. Through systematic design exploration, we also identify a novel ray tracing layout that combines optimization techniques from prior work, achieving Pareto-optimality across diverse architectures and scenes.
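To make the idea of separating layout from traversal concrete, here is a minimal sketch in Python. It is not Scion syntax and not the authors' implementation; the abstract does not show either. All names (`BVHLayout`, `AoSLayout`, `SoALayout`, `query_point`) are hypothetical, and the example uses 2D boxes and a point query purely for brevity. It shows one traversal routine running unchanged over two different node layouts, an array-of-structures and a structure-of-arrays.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

Box = Tuple[float, float, float, float]  # 2D AABB: (xmin, ymin, xmax, ymax)


class BVHLayout(Protocol):
    """Hypothetical accessor interface; traversal code only sees this."""
    def box(self, i: int) -> Box: ...
    def children(self, i: int) -> Tuple[int, int]: ...  # (-1, -1) marks a leaf
    def prim(self, i: int) -> int: ...


@dataclass
class Node:
    box: Box
    left: int
    right: int
    prim: int  # primitive id, meaningful only for leaves


class AoSLayout:
    """Array-of-structures: each node's fields are stored together."""
    def __init__(self, nodes: List[Node]):
        self.nodes = list(nodes)

    def box(self, i): return self.nodes[i].box
    def children(self, i): return (self.nodes[i].left, self.nodes[i].right)
    def prim(self, i): return self.nodes[i].prim


class SoALayout:
    """Structure-of-arrays: each field lives in its own contiguous array."""
    def __init__(self, nodes: List[Node]):
        self.xmin = [n.box[0] for n in nodes]
        self.ymin = [n.box[1] for n in nodes]
        self.xmax = [n.box[2] for n in nodes]
        self.ymax = [n.box[3] for n in nodes]
        self.left = [n.left for n in nodes]
        self.right = [n.right for n in nodes]
        self.prims = [n.prim for n in nodes]

    def box(self, i): return (self.xmin[i], self.ymin[i], self.xmax[i], self.ymax[i])
    def children(self, i): return (self.left[i], self.right[i])
    def prim(self, i): return self.prims[i]


def query_point(bvh: BVHLayout, x: float, y: float) -> List[int]:
    """Layout-independent traversal: ids of leaf primitives whose box contains (x, y)."""
    hits, stack = [], [0]  # node 0 is assumed to be the root
    while stack:
        i = stack.pop()
        xmin, ymin, xmax, ymax = bvh.box(i)
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # prune this subtree
        left, right = bvh.children(i)
        if left < 0:
            hits.append(bvh.prim(i))
        else:
            stack.append(right)
            stack.append(left)  # left child is visited first
    return hits


# A tiny tree: one root with two overlapping leaves.
NODES = [
    Node((0.0, 0.0, 4.0, 2.0), 1, 2, -1),
    Node((0.0, 0.0, 2.0, 2.0), -1, -1, 10),
    Node((1.5, 0.0, 4.0, 2.0), -1, -1, 20),
]
```

Calling `query_point(AoSLayout(NODES), 3.0, 1.0)` and `query_point(SoALayout(NODES), 3.0, 1.0)` returns the same result because the traversal never touches storage directly. In a compiled setting, a layout-aware compiler could presumably resolve such accessors statically rather than through dynamic dispatch as this sketch does.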