CLM: Removing the GPU Memory Barrier for 3D Gaussian Splatting

📅 2025-11-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
3D Gaussian Splatting (3DGS) struggles with memory-bound deployment on large-scale scenes, especially on single consumer GPUs. Method: This paper proposes a CPU-GPU collaborative rendering framework tailored for single-GPU systems. Its core innovation is a dynamic Gaussian offloading strategy guided by access-pattern prediction, which migrates inactive Gaussians to CPU memory. To minimize data-migration overhead, it employs computation-communication pipelining and fine-grained scheduling. Crucially, the method preserves the original 3DGS representation and training pipeline: no modifications are required. Results: Evaluated on an RTX 4090, the framework efficiently renders scenes containing up to 100 million Gaussians, achieving state-of-the-art reconstruction quality while reducing GPU memory consumption by 67%. It enables practical deployment of 3DGS on resource-constrained hardware without compromising fidelity or compatibility.
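The offloading idea in the summary can be pictured as a bounded "hot set" of Gaussians resident on the GPU, with cold ones migrated to CPU memory. The sketch below is an illustrative simplification only: the class and method names are invented, and it uses plain least-recently-used eviction where CLM uses access-pattern prediction.

```python
class GaussianOffloader:
    """Hypothetical sketch: keep a bounded hot set of Gaussians in GPU
    memory and evict the least-recently-used ones to CPU memory.
    (CLM's real predictor and data layout are more sophisticated.)"""

    def __init__(self, gpu_budget):
        self.gpu_budget = gpu_budget   # max Gaussians resident on GPU
        self.gpu_resident = {}         # Gaussian id -> last frame accessed
        self.cpu_resident = set()      # ids offloaded to CPU memory

    def touch(self, gaussian_ids, frame):
        """Record that `gaussian_ids` are needed for `frame`: load any that
        were offloaded, then evict the coldest ones if over budget."""
        for gid in gaussian_ids:
            self.cpu_resident.discard(gid)   # bring back from CPU if needed
            self.gpu_resident[gid] = frame
        while len(self.gpu_resident) > self.gpu_budget:
            cold = min(self.gpu_resident, key=self.gpu_resident.get)
            del self.gpu_resident[cold]
            self.cpu_resident.add(cold)      # offload cold Gaussian to CPU
```

For example, with a budget of 3, touching Gaussians 1-3 and then 4 evicts the least recently used Gaussian (id 1) to CPU memory while keeping the working set on the GPU.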

πŸ“ Abstract
3D Gaussian Splatting (3DGS) is an increasingly popular novel view synthesis approach due to its fast rendering and high-quality output. However, scaling 3DGS to large (or intricate) scenes is challenging due to its large memory requirement, which exceeds most GPUs' memory capacity. In this paper, we describe CLM, a system that allows 3DGS to render large scenes using a single consumer-grade GPU, e.g., an RTX 4090. It does so by offloading Gaussians to CPU memory and loading them into GPU memory only when necessary. To reduce performance and communication overheads, CLM uses a novel offloading strategy that exploits observations about 3DGS's memory access pattern for pipelining, thus overlapping GPU-to-CPU communication, GPU computation, and CPU computation. Furthermore, we exploit the same observations about the access pattern to reduce communication volume. Our evaluation shows that the resulting implementation can render a large scene that requires 100 million Gaussians on a single RTX 4090 and achieve state-of-the-art reconstruction quality.
Problem

Research questions and friction points this paper is trying to address.

Reducing GPU memory requirements for 3D Gaussian Splatting
Enabling large-scale scene rendering on consumer GPUs
Optimizing memory access patterns to minimize communication overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Offloading Gaussians to CPU memory
Pipelining GPU-CPU communication and computation
Reducing communication volume via access patterns
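The pipelining contribution above amounts to overlapping data transfer with computation so that communication latency is hidden. A minimal double-buffered sketch, assuming invented function names and using a thread to stand in for an asynchronous CUDA transfer:

```python
from concurrent.futures import ThreadPoolExecutor

def render_pipelined(batches, transfer, compute):
    """Hypothetical sketch of computation-communication pipelining:
    while `compute` runs on batch i, prefetch batch i+1 via `transfer`,
    hiding transfer latency behind computation. (CLM overlaps real
    CPU-GPU transfers and kernels; a worker thread simulates that here.)"""
    if not batches:
        return []
    results = []
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        pending = prefetcher.submit(transfer, batches[0])
        for i in range(len(batches)):
            data = pending.result()                  # wait for transfer i
            if i + 1 < len(batches):                 # kick off transfer i+1
                pending = prefetcher.submit(transfer, batches[i + 1])
            results.append(compute(data))            # overlaps transfer i+1
    return results
```

The key design point is that the next batch's transfer is issued *before* the current batch's computation starts, so the two proceed concurrently.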
Hexu Zhao
New York University, New York, NY, USA
Xiwen Min
New York University, New York, NY, USA
Xiaoteng Liu
New York University, New York, NY, USA
Moonjun Gong
New York University
Computer Vision, Autonomous Vehicles
Yiming Li
New York University, New York, NY, USA
Ang Li
Pacific Northwest National Laboratory & University of Washington, Seattle, WA, USA
Saining Xie
Courant Institute, New York University
computer vision, machine learning, representation learning, artificial intelligence
Jinyang Li
New York University, New York, NY, USA
Aurojit Panda
NYU
Distributed Systems, Networking, Cluster Computing