Neo: Real-Time On-Device 3D Gaussian Splatting with Reuse-and-Update Sorting Acceleration

📅 2025-11-16
🤖 AI Summary
Problem: Real-time rendering of 3D Gaussian Splatting (3DGS) on resource-constrained devices is bottlenecked by the sorting stage, which demands high memory bandwidth. Method: The authors observe strong temporal coherence in Gaussian depth ordering across consecutive frames and propose a reuse-and-update sorting algorithm that tracks and incrementally updates the previous frame's ordering instead of re-sorting every frame from scratch. They pair it with a dedicated hardware accelerator optimized for this algorithm. Contribution/Results: Neo achieves up to 10.0× higher throughput than a state-of-the-art edge GPU and up to 5.6× higher throughput than a state-of-the-art ASIC solution, while reducing DRAM traffic by 94.5% and 81.3%, respectively. These gains make high-quality, low-latency on-device 3D rendering more practical for AR/VR.

📝 Abstract
Real-time 3D Gaussian Splatting (3DGS) rendering on resource-constrained devices is essential for delivering immersive augmented and virtual reality (AR/VR) experiences. However, existing solutions struggle to achieve high frame rates, especially for high-resolution rendering. Our analysis identifies the sorting stage in the 3DGS rendering pipeline as the major bottleneck due to its high memory bandwidth demand. This paper presents Neo, which introduces a reuse-and-update sorting algorithm that exploits temporal redundancy in Gaussian ordering across consecutive frames, and devises a hardware accelerator optimized for this algorithm. By efficiently tracking and updating Gaussian depth ordering instead of re-sorting from scratch, Neo significantly reduces redundant computations and memory bandwidth pressure. Experimental results show that Neo achieves up to 10.0× and 5.6× higher throughput than state-of-the-art edge GPU and ASIC solutions, respectively, while reducing DRAM traffic by 94.5% and 81.3%. These improvements make high-quality and low-latency on-device 3D rendering more practical.
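To compare the DRAM traffic figures with the throughput multipliers, the percentage reductions can be converted into reduction factors. This is plain arithmetic on the numbers quoted in the abstract, not a result reported by the paper:

```latex
\[
\frac{1}{1 - 0.945} = \frac{1}{0.055} \approx 18.2\times,
\qquad
\frac{1}{1 - 0.813} = \frac{1}{0.187} \approx 5.3\times
\]
```

In other words, a 94.5% reduction leaves about 5.5% of the baseline traffic, and an 81.3% reduction leaves about 18.7%.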
Problem

Research questions and friction points this paper is trying to address.

Achieving real-time 3D Gaussian Splatting rendering on resource-limited AR/VR devices
Overcoming the sorting-stage bottleneck in the 3DGS pipeline, which is caused by its high memory bandwidth demand
Reducing redundant computations and memory pressure for high-quality on-device rendering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reuse-and-update sorting algorithm reduces redundant computations
Hardware accelerator optimized for Gaussian depth ordering
Temporal redundancy exploitation cuts memory bandwidth pressure
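The reuse-and-update idea can be illustrated with a minimal Python sketch. The summary does not spell out Neo's actual algorithm or data layout, so this is only an assumption-laden stand-in: it reuses the previous frame's depth order and repairs it with insertion sort, which does little work when the order has barely changed. The function name `reuse_and_update_order` and the toy depth values are hypothetical, and the sketch ignores Gaussians entering or leaving the view.

```python
from typing import List, Sequence, Tuple


def reuse_and_update_order(
    prev_order: Sequence[int], depths: Sequence[float]
) -> Tuple[List[int], int]:
    """Repair the previous frame's front-to-back order for the current depths.

    Illustrative stand-in only (insertion sort on a nearly sorted list),
    not the algorithm from the paper. Returns the updated order and the
    number of element shifts performed, as a rough proxy for update work.
    """
    order = list(prev_order)
    shifts = 0
    for i in range(1, len(order)):
        gaussian = order[i]
        j = i - 1
        # Move entries that are now farther away one slot to the right.
        while j >= 0 and depths[order[j]] > depths[gaussian]:
            order[j + 1] = order[j]
            shifts += 1
            j -= 1
        order[j + 1] = gaussian
    return order, shifts


# Frame 0: no previous order exists, so sort from scratch.
depths_f0 = [2.0, 0.5, 1.0, 3.0]
order_f0 = sorted(range(len(depths_f0)), key=depths_f0.__getitem__)

# Frame 1: the camera moved slightly, so only Gaussians 1 and 2 swap places.
depths_f1 = [2.1, 0.9, 0.8, 3.2]
order_f1, shifts_f1 = reuse_and_update_order(order_f0, depths_f1)
print(order_f0, order_f1, shifts_f1)
```

When consecutive frames are temporally coherent, the number of shifts stays small relative to a full re-sort, which is the intuition behind avoiding a from-scratch sort every frame.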