🤖 AI Summary
This work addresses the challenges posed by dynamic scenes, memory constraints, and periodic boundary conditions in fixed-radius nearest neighbor (FRNN) searches for particle-based physics simulations. The authors propose an RT Core-accelerated FRNN method that tunes the ratio of BVH updates to rebuilds in real time, eliminates the need for a neighbor list, and supports periodic boundary conditions directly within the RT Core pipeline. Together these broaden the range of cases in which hardware-accelerated FRNN can be applied and improve its energy efficiency. Experimental results on the Lennard-Jones model show an RT Core pipeline up to $\sim3.4\times$ faster than other known BVH management approaches and a $\sim1.3\times$ to $\sim2.0\times$ improvement in per-step simulation performance over the base RT Core approach. The method also makes it possible to simulate non-uniform particle systems, such as clusters with log-normal radius distributions, that would otherwise not fit in memory because of the neighbor list.
📝 Abstract
In this work we introduce three ideas that can further improve particle FRNN physics simulations running on RT Cores: i) a real-time update/rebuild ratio optimizer for the bounding volume hierarchy (BVH) structure, ii) a new use of RT Cores, with two variants, that eliminates the need for a neighbor list, and iii) a technique that enables RT Cores for FRNN with periodic boundary conditions (BC). Experimental evaluation using the Lennard-Jones FRNN interaction model as a case study shows that the proposed update/rebuild ratio optimizer adapts to the different dynamics that emerge during a simulation, leading to an RT Core pipeline up to $\sim 3.4\times$ faster than with other known approaches to managing the BVH. In terms of simulation step performance, the proposed variants significantly improve the speedup and energy efficiency (EE) of the base RT Core idea, from $\sim1.3\times$ at small radius to $\sim2.0\times$ for log-normal radius distributions. Furthermore, the proposed variants manage to simulate cases that would otherwise not fit in memory because of the use of neighbor lists, such as clusters of particles with a log-normal radius distribution. The proposed RT Core technique to support periodic BC is effective, as it does not introduce any significant performance penalty. In terms of scaling, the proposed methods scale both their performance and EE across GPU generations. Throughout the experimental evaluation, we also identify the simulation cases where regular GPU computation should still be preferred, contributing to the understanding of the strengths and limitations of RT Cores.
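For readers unfamiliar with the terms above, the following is a minimal CPU reference sketch of what an FRNN Lennard-Jones evaluation with periodic BC computes. It is not the RT Core method of this work: it is a brute-force $O(n^2)$ loop that assumes a cubic box and uses the standard minimum-image convention. The function names and parameters (`min_image`, `lj_energy_periodic`, `box`, `epsilon`, `sigma`) are hypothetical and chosen only for illustration. Like the proposed variants, it accumulates each interaction as soon as a neighbor is found instead of storing a neighbor list.

```python
import math


def min_image(d, box):
    """Wrap a coordinate difference into [-box/2, box/2] (minimum-image convention)."""
    return d - box * round(d / box)


def lj_potential(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair potential 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_energy_periodic(positions, radius, box, epsilon=1.0, sigma=1.0):
    """Total LJ energy over all pairs closer than `radius` in a periodic cubic box.

    Assumes radius <= box / 2 so that the minimum image is the only
    periodic copy of a particle that can fall within the cutoff.
    Interactions are accumulated on the fly; no neighbor list is stored.
    """
    if radius > box / 2:
        raise ValueError("minimum-image convention requires radius <= box / 2")
    energy = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            # Per-axis displacement to the nearest periodic image of particle j.
            d = [min_image(a - b, box) for a, b in zip(positions[i], positions[j])]
            r = math.sqrt(sum(c * c for c in d))
            if 0.0 < r <= radius:
                energy += lj_potential(r, epsilon, sigma)
    return energy
```

In this sketch, two particles placed near opposite faces of the box interact through the boundary, while the same pair with a non-periodic distance check would be treated as far apart. The $O(n^2)$ pair loop is the part that a spatial structure such as a BVH is meant to replace.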