🤖 AI Summary
3D Gaussian Splatting (3DGS) struggles with memory-bound deployment on large-scale scenes, especially on single consumer GPUs.
Method: This paper proposes a CPU-GPU collaborative rendering framework tailored for single-GPU systems. Its core innovation is a dynamic Gaussian offloading strategy guided by access-pattern prediction, which migrates inactive Gaussians to CPU memory. To minimize data migration overhead, it employs computation-communication pipelining and fine-grained scheduling. Crucially, the method preserves the original 3DGS representation and training pipeline; no modifications are required.
Results: Evaluated on an RTX 4090, the framework efficiently renders scenes containing up to 100 million Gaussians, achieving state-of-the-art reconstruction quality while reducing GPU memory consumption by 67%. It enables practical deployment of 3DGS on resource-constrained hardware without compromising fidelity or compatibility.
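The core idea of access-pattern-guided offloading can be illustrated with a minimal sketch. This is not the paper's actual predictor or scheduler: the distance-based visibility test and names like `predict_active` are illustrative assumptions standing in for the real access-pattern prediction.

```python
# Illustrative sketch (not CLM's implementation): Gaussians predicted
# inactive for the next view stay in CPU memory; only the predicted-active
# subset is made resident on the GPU.

def predict_active(centers, cam_pos, radius):
    """Toy visibility predictor: a Gaussian is 'active' if its center lies
    within `radius` of the camera. A stand-in for the real frustum and
    access-pattern tests, which are more sophisticated."""
    active = set()
    for i, (x, y, z) in enumerate(centers):
        dx, dy, dz = x - cam_pos[0], y - cam_pos[1], z - cam_pos[2]
        if dx * dx + dy * dy + dz * dz <= radius * radius:
            active.add(i)
    return active

def schedule_transfers(gpu_resident, active):
    """Diff the predicted-active set against what is already on the GPU:
    upload the missing Gaussians, evict the ones no longer needed."""
    to_upload = active - gpu_resident
    to_evict = gpu_resident - active
    return to_upload, to_evict

# Example: 4 Gaussians, camera at the origin; only nearby ones stay on GPU.
centers = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (10.0, 0.0, 0.0), (0.0, 9.0, 0.0)]
active = predict_active(centers, (0.0, 0.0, 0.0), radius=5.0)
up, ev = schedule_transfers(gpu_resident={0, 2}, active=active)
```

Diffing against the currently resident set is what keeps communication volume low: only the change in the working set crosses the PCIe bus, not the whole scene.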
📝 Abstract
3D Gaussian Splatting (3DGS) is an increasingly popular novel view synthesis approach due to its fast rendering time and high-quality output. However, scaling 3DGS to large (or intricate) scenes is challenging due to its large memory requirement, which exceeds most GPUs' memory capacity. In this paper, we describe CLM, a system that allows 3DGS to render large scenes using a single consumer-grade GPU, e.g., an RTX 4090. It does so by offloading Gaussians to CPU memory and loading them into GPU memory only when necessary. To reduce performance and communication overheads, CLM uses a novel offloading strategy that exploits observations about 3DGS's memory access pattern to enable pipelining, thus overlapping GPU-to-CPU communication, GPU computation, and CPU computation. Furthermore, we exploit further observations about the access pattern to reduce communication volume. Our evaluation shows that the resulting implementation can render a large scene that requires 100 million Gaussians on a single RTX 4090 while achieving state-of-the-art reconstruction quality.
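The overlap of communication with computation described above is essentially double buffering: while chunk k is being processed, the copy of chunk k+1 is already in flight. A minimal sketch of that scheduling pattern, with a thread standing in for the copy engine and `transfer`/`compute` as illustrative stand-ins for a host-device copy and rasterization (this is not CLM's actual pipeline):

```python
# Illustrative computation-communication pipelining via double buffering:
# while chunk k is "computed", the "transfer" of chunk k+1 runs on a
# worker thread, hiding copy latency behind compute.
from concurrent.futures import ThreadPoolExecutor

def transfer(chunk):
    # Stand-in for a host-to-device copy of a chunk of Gaussians.
    return list(chunk)

def compute(chunk):
    # Stand-in for rasterizing the chunk on the GPU.
    return sum(chunk)

def pipelined_render(chunks):
    results = []
    with ThreadPoolExecutor(max_workers=1) as copier:
        pending = copier.submit(transfer, chunks[0])   # prefetch first chunk
        for nxt in chunks[1:]:
            current = pending.result()                 # copy of chunk k done
            pending = copier.submit(transfer, nxt)     # start copy of chunk k+1
            results.append(compute(current))           # overlaps with the copy
        results.append(compute(pending.result()))      # drain the last chunk
    return results

# Example: three chunks of Gaussian "payloads".
out = pipelined_render([[1, 2], [3, 4], [5]])
```

On a real GPU the same structure would use separate CUDA streams and pinned host memory so asynchronous copies genuinely overlap kernel execution.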