🤖 AI Summary
Problem: Secondary rays in GPU-accelerated ray tracing exhibit poor spatial locality and low SIMT execution efficiency, limiting overall performance.
Method: This paper proposes a ray reordering framework that is agnostic to the particular trace kernel. It is centered on an Endpoint-Estimated Key, a modification of an earlier termination-point-estimation method that is designed for secondary rays and aims to improve spatial coherence. The reordering is evaluated within wavefront path tracing using RTX trace kernels.
Contribution/Results: Evaluated on recent GPUs, ray reordering raises trace speed by 1.3–2.0×. However, recovering the reordering overhead within the hardware-accelerated trace phase is problematic, so the faster trace does not by itself guarantee a net gain. The work surveys existing sort-key methods and offers a portable approach to GPU ray reordering that does not depend on the trace kernel implementation.
📝 Abstract
We study ray reordering as a tool for increasing the performance of existing GPU ray tracing implementations. We focus on ray reordering that is fully agnostic to the particular trace kernel. We summarize the existing methods for computing ray sorting keys and discuss their properties. We propose a novel modification of a previously proposed method based on termination point estimation that is well suited to tracing secondary rays. We evaluate the ray reordering techniques in the context of wavefront path tracing using the RTX trace kernels. We show that ray reordering yields significantly higher trace speed on recent GPUs (1.3–2.0×), but that recovering the reordering overhead in the hardware-accelerated trace phase is problematic.
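
To make the idea of a sorting key built from an estimated termination point concrete, here is a minimal Python sketch. It is an illustration of the general technique, not the key proposed in the paper: the function names, the 10-bit-per-axis Morton quantization, and the `t_est` input (an estimated hit distance, for instance carried over from a previous bounce) are all assumptions introduced here.

```python
def expand_bits(v: int) -> int:
    """Spread the low 10 bits of v so two zero bits follow each original bit."""
    v &= 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8)) & 0x0300F00F
    v = (v | (v << 4)) & 0x030C30C3
    v = (v | (v << 2)) & 0x09249249
    return v


def morton3(x: int, y: int, z: int) -> int:
    """Interleave three 10-bit integers into a 30-bit Morton code."""
    return expand_bits(x) | (expand_bits(y) << 1) | (expand_bits(z) << 2)


def endpoint_key(origin, direction, t_est, bounds_min, bounds_max) -> int:
    """Hypothetical sort key: Morton code of the estimated endpoint.

    t_est is an assumed estimate of the ray's hit distance; how it is
    obtained is outside the scope of this sketch.
    """
    cell = []
    for o, d, lo, hi in zip(origin, direction, bounds_min, bounds_max):
        p = o + d * t_est                      # estimated termination point
        n = (p - lo) / (hi - lo)               # normalize to the scene bounds
        n = min(max(n, 0.0), 1.0)              # clamp points outside the bounds
        cell.append(min(int(n * 1024), 1023))  # quantize to 10 bits
    return morton3(*cell)


def reorder(rays, bounds_min, bounds_max):
    """Return ray indices sorted by key; the trace kernel itself is untouched."""
    keys = [endpoint_key(o, d, t, bounds_min, bounds_max) for o, d, t in rays]
    return sorted(range(len(rays)), key=keys.__getitem__)
```

Because the key depends only on per-ray data and the scene bounds, the sort can run as a separate pass before any trace kernel, which is what makes this kind of reordering independent of the kernel. On a GPU the sort would typically be a radix sort over the keys; whether its cost is recovered is the open question the abstract raises.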