GS-Cache: A GS-Cache Inference Framework for Large-scale Gaussian Splatting Models

📅 2025-02-20

🤖 AI Summary

Rendering large-scale 3D Gaussian Splatting (3DGS) models in real time on consumer-grade devices for immersive applications (e.g., VR) faces conflicting demands: high fidelity, ultra-low latency, and severe hardware resource constraints. To address this, we propose an end-to-end inference framework featuring a novel cache-centric rendering pipeline and an efficiency-aware, elastic multi-GPU scheduler, enabling tight co-optimization across system and representation levels. Our approach integrates 3DGS representation compression, custom CUDA kernels, hierarchical GPU memory caching, and dynamic load-balancing algorithms. Experiments demonstrate that our method achieves up to a 5.35× speedup over baseline implementations, reduces end-to-end latency by 35%, and cuts GPU memory consumption by 42%. It supports high-quality stereo 2K rendering at over 120 FPS on commodity hardware.

📝 Abstract

Rendering large-scale 3D Gaussian Splatting (3DGS) models faces significant challenges in achieving real-time, high-fidelity performance on consumer-grade devices. Fully realizing the potential of 3DGS in applications such as virtual reality (VR) requires addressing critical system-level challenges to support real-time, immersive experiences. We propose GS-Cache, an end-to-end framework that seamlessly integrates 3DGS's advanced representation with a highly optimized rendering system. GS-Cache introduces a cache-centric pipeline to eliminate redundant computations, an efficiency-aware scheduler for elastic multi-GPU rendering, and optimized CUDA kernels to overcome computational bottlenecks. This synergy between 3DGS and system design enables GS-Cache to achieve up to 5.35x performance improvement, 35% latency reduction, and 42% lower GPU memory usage, supporting 2K binocular rendering at over 120 FPS with high visual quality. By bridging the gap between 3DGS's representation power and the demands of VR systems, GS-Cache establishes a scalable and efficient framework for real-time neural rendering in immersive environments.
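
To put the 120 FPS figure in perspective, the following is a rough back-of-the-envelope sketch of the per-frame time budget it implies. It is not taken from the paper: the function names are hypothetical, the even split between the two eyes assumes they are rendered one after the other at equal cost, and the "implied baseline" assumes, purely for illustration, that the optimized system exactly fills the 120 FPS budget and that the reported 35% latency reduction applies at that operating point.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def latency_before_reduction_ms(optimized_ms: float, reduction: float) -> float:
    """Latency that a fractional reduction would have started from."""
    return optimized_ms / (1.0 - reduction)


# 120 FPS is the lower bound quoted in the abstract.
budget = frame_budget_ms(120)

# Assumption: both eye views are rendered sequentially at equal cost.
per_eye = budget / 2

# Assumption: the optimized pipeline uses the whole budget and the 35%
# reduction applies here; the paper does not state this operating point.
implied_baseline = latency_before_reduction_ms(budget, 0.35)

print(f"frame budget:     {budget:.2f} ms")            # 8.33 ms
print(f"per-eye budget:   {per_eye:.2f} ms")           # 4.17 ms
print(f"implied baseline: {implied_baseline:.2f} ms")  # 12.82 ms
```

Under these assumptions, a binocular frame has to be finished in roughly 8.3 ms, which is why removing redundant computation between views and frames matters more than raw throughput alone.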

Problem

Research questions and friction points this paper is trying to address.

Real-time 3D Gaussian Splatting rendering

Optimizing multi-GPU performance

Reducing latency and GPU memory usage

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cache-centric pipeline design

Efficiency-aware GPU scheduler

Optimized CUDA kernels
