🤖 AI Summary
3D Gaussian Splatting achieves only 7–17 FPS on edge GPUs (e.g., Jetson Orin NX), falling far short of the ≥60 FPS requirement for real-time AR/VR rendering.
Method: We propose a hardware-software co-design: (i) a plug-and-play hardware accelerator, the Gaussian Blending Unit (GBU), tailored for Gaussian rendering and integrable with conventional edge GPUs; (ii) an intra-row sequential shading (IRSS) dataflow built on a two-step coordinate transformation, paired with a rendering engine that balances per-row workload by aggregating computations from multiple Gaussians; and (iii) a GPU–GBU collaborative execution framework that supports a broad range of AR/VR applications.
Results: Deploying the IRSS dataflow directly on the GPU already yields a 1.72× speedup on real-world static scenes, though this alone still falls short of real time; the full GPU–GBU system achieves on-device real-time (60+ FPS) rendering on the Jetson Orin NX while preserving state-of-the-art visual quality across diverse AR/VR applications.
📝 Abstract
The rapidly advancing field of Augmented and Virtual Reality (AR/VR) demands real-time, photorealistic rendering on resource-constrained platforms. 3D Gaussian Splatting, delivering state-of-the-art (SOTA) performance in rendering efficiency and quality, has emerged as a promising solution across a broad spectrum of AR/VR applications. However, despite its effectiveness on high-end GPUs, it struggles on edge systems like the Jetson Orin NX edge GPU, achieving only 7–17 FPS -- well below the 60+ FPS standard required for truly immersive AR/VR experiences. Addressing this challenge, we perform a comprehensive analysis of Gaussian-based AR/VR applications and identify the Gaussian Blending Stage, which intensively calculates each Gaussian's contribution at every pixel, as the primary bottleneck. In response, we propose a Gaussian Blending Unit (GBU), an edge GPU plug-in module for real-time rendering in AR/VR applications. Notably, our GBU can be seamlessly integrated into conventional edge GPUs and collaboratively supports a wide range of AR/VR applications. Specifically, GBU incorporates an intra-row sequential shading (IRSS) dataflow that shades each row of pixels sequentially from left to right, utilizing a two-step coordinate transformation. When directly deployed on a GPU, the proposed dataflow achieves a non-trivial 1.72× speedup on real-world static scenes, though it still falls short of real-time rendering performance. Recognizing the limited compute utilization in the GPU-based implementation, GBU enhances rendering speed with a dedicated rendering engine that balances the workload across rows by aggregating computations from multiple Gaussians. Experiments across representative AR/VR applications demonstrate that our GBU provides a unified solution for on-device real-time rendering while maintaining SOTA rendering quality.
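For context, the Gaussian Blending Stage identified above as the bottleneck is, in standard 3D Gaussian Splatting, a per-pixel alpha-blend over depth-sorted projected Gaussians. The sketch below illustrates that per-pixel computation only; it is not the paper's GBU or IRSS implementation, and the `blend_pixel` function and its data layout are illustrative naming of our own:

```python
import math

def blend_pixel(px, py, gaussians):
    """Alpha-blend depth-sorted 2D Gaussians at one pixel (standard 3DGS).

    Each Gaussian is a dict with:
      mean    -- projected 2D center (x, y)
      inv_cov -- inverse 2x2 covariance as (a, b, c) for [[a, b], [b, c]]
      opacity -- base opacity in (0, 1]
      color   -- RGB tuple
    The list must already be sorted front-to-back by depth.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for g in gaussians:
        dx = px - g["mean"][0]
        dy = py - g["mean"][1]
        a, b, c = g["inv_cov"]
        # Gaussian falloff: exp(-0.5 * d^T Sigma^{-1} d)
        power = -0.5 * (a * dx * dx + 2.0 * b * dx * dy + c * dy * dy)
        alpha = min(0.99, g["opacity"] * math.exp(power))
        if alpha < 1.0 / 255.0:
            continue  # negligible contribution
        for ch in range(3):
            color[ch] += transmittance * alpha * g["color"][ch]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:
            break  # early termination: pixel is effectively opaque
    return tuple(color), transmittance

# Two overlapping Gaussians at the pixel center: the front (red) one
# dominates, the rear (blue) one contributes through what remains.
gs = [
    {"mean": (5.0, 5.0), "inv_cov": (0.5, 0.0, 0.5),
     "opacity": 0.8, "color": (1.0, 0.0, 0.0)},
    {"mean": (5.0, 5.0), "inv_cov": (0.5, 0.0, 0.5),
     "opacity": 0.8, "color": (0.0, 0.0, 1.0)},
]
col, T = blend_pixel(5.0, 5.0, gs)  # col ~ (0.8, 0.0, 0.16), T ~ 0.04
```

Because this inner loop runs for every Gaussian overlapping every pixel, its cost scales with scene density, which is why it dominates runtime on edge GPUs.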