GSta: Efficient Training Scheme with Siestaed Gaussians for Monocular 3D Scene Reconstruction

📅 2025-04-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the slow training speed and high GPU memory and storage overhead of Gaussian Splatting (GS) in monocular 3D scene reconstruction, this paper proposes a dynamic "sleeping" mechanism. The method introduces an adaptive convergence criterion based on positional and color gradient norms to freeze stabilized Gaussians during training, and integrates PSNR-driven early stopping on a subset of training images with coordinated learning rate scheduling. The authors present this as the first approach to dynamically freeze converged Gaussians *during* training. It achieves Pareto-optimal improvements over vanilla GS—up to 5× faster training, roughly 50% lower peak GPU memory, and 16× smaller disk storage—while preserving reconstruction accuracy and rendering quality. The proposed framework makes GS markedly more practical for resource-constrained scenarios such as robotic edge deployment.
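The core "siesta" idea in the summary above—freezing Gaussians whose positional and color gradients have stabilized—can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the threshold values and function names are hypothetical, and a real GS trainer would apply the mask to per-Gaussian optimizer updates.

```python
import numpy as np

def update_frozen_mask(pos_grads, color_grads, frozen,
                       pos_thresh=1e-4, color_thresh=1e-4):
    """Freeze Gaussians whose position AND color gradient norms have both
    dropped below their thresholds (thresholds are illustrative only)."""
    pos_norm = np.linalg.norm(pos_grads, axis=1)
    color_norm = np.linalg.norm(color_grads, axis=1)
    converged = (pos_norm < pos_thresh) & (color_norm < color_thresh)
    # Once a Gaussian goes into its "siesta", it stays frozen.
    return frozen | converged

# Toy example: 4 Gaussians; only the second has near-zero gradients in both.
pos_g = np.array([[1e-2, 0, 0], [1e-6, 0, 0], [5e-3, 0, 0], [1e-6, 0, 0]])
col_g = np.array([[1e-2, 0, 0], [1e-6, 0, 0], [1e-6, 0, 0], [2e-2, 0, 0]])
frozen = np.zeros(4, dtype=bool)
frozen = update_frozen_mask(pos_g, col_g, frozen)
# frozen → [False, True, False, False]
```

In a full pipeline, the frozen mask would be used to zero out (or skip) gradient updates for those Gaussians, which is where the training-time savings come from.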


📝 Abstract
Gaussian Splatting (GS) is a popular approach for 3D reconstruction, mostly due to its ability to converge reasonably fast, faithfully represent the scene and render (novel) views in a fast fashion. However, it suffers from large storage and memory requirements, and its training speed still lags behind the hash-grid based radiance field approaches (e.g. Instant-NGP), which makes it especially difficult to deploy them in robotics scenarios, where 3D reconstruction is crucial for accurate operation. In this paper, we propose GSta that dynamically identifies Gaussians that have converged well during training, based on their positional and color gradient norms. By forcing such Gaussians into a siesta and stopping their updates (freezing) during training, we improve training speed with competitive accuracy compared to state of the art. We also propose an early stopping mechanism based on the PSNR values computed on a subset of training images. Combined with other improvements, such as integrating a learning rate scheduler, GSta achieves an improved Pareto front in convergence speed, memory and storage requirements, while preserving quality. We also show that GSta can improve other methods and complement orthogonal approaches in efficiency improvement; once combined with Trick-GS, GSta achieves up to 5x faster training, 16x smaller disk size compared to vanilla GS, while having comparable accuracy and consuming only half the peak memory. More visualisations are available at https://anilarmagan.github.io/SRUK-GSta.
Problem

Research questions and friction points this paper is trying to address.

Reduces storage and memory in Gaussian Splatting for 3D reconstruction
Improves training speed while maintaining competitive accuracy
Enables efficient deployment in robotics via dynamic Gaussian freezing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Gaussian freezing for faster training
Early stopping based on PSNR values
Integrated learning rate scheduler optimization
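The PSNR-based early stopping listed above can be illustrated with a small self-contained sketch. The patience and improvement-threshold values below are assumptions for illustration, not the paper's actual criterion, which evaluates PSNR on a subset of training images.

```python
import math

def psnr(mse):
    """PSNR in dB for images normalized to [0, 1]."""
    return 10.0 * math.log10(1.0 / mse) if mse > 0 else float("inf")

def should_stop(psnr_history, patience=3, min_delta=0.05):
    """Stop when subset PSNR has not improved by at least `min_delta`
    for `patience` consecutive evaluations (hypothetical criterion)."""
    if len(psnr_history) <= patience:
        return False
    best_before = max(psnr_history[:-patience])
    recent = psnr_history[-patience:]
    return all(p < best_before + min_delta for p in recent)

# Plateaued PSNR triggers a stop; steady improvement does not.
print(should_stop([28.0, 29.0, 29.01, 29.02, 29.03]))  # True
print(should_stop([28.0, 29.0, 30.0, 31.0, 32.0]))     # False
```

Evaluating PSNR only on a small fixed subset keeps the stopping check cheap relative to a full validation pass.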
Anil Armagan
Samsung R&D Institute UK (SRUK)
Albert Saà-Garriga
Principal Research Engineer at Samsung Electronics
Parallel Computing · Computer Vision · Source-to-Source Compilers
Bruno Manganelli
Samsung R&D Institute UK (SRUK)
Kyuwon Kim
Samsung Electronics
M. K. Yucel
Samsung R&D Institute UK (SRUK)