UP-SLAM: Adaptively Structured Gaussian SLAM with Uncertainty Prediction in Dynamic Environments

📅 2025-05-28
🤖 AI Summary
To address the poor real-time performance and weak robustness of 3D Gaussian Splatting (3DGS)-based SLAM in dynamic scenes, this paper proposes UP-SLAM, an RGB-D SLAM framework that decouples tracking from mapping. First, it introduces a training-free, open-set, pixel-wise motion uncertainty estimator that fuses multi-modal residuals with DINO features. Second, it manages Gaussian primitives with a probabilistic octree, which supports adaptive insertion and pruning of primitives and yields a reusable static map. Third, it runs tracking and mapping in parallel and adds a temporal encoder together with a shallow MLP that maps low-dimensional features to DINO features. Relative to state-of-the-art methods, the reported experiments show a 59.8% improvement in localization accuracy and a 4.57 dB gain in rendering PSNR, while maintaining real-time operation and generating artifact-free static maps.

📝 Abstract
Recent 3D Gaussian Splatting (3DGS) techniques for Visual Simultaneous Localization and Mapping (SLAM) have made significant progress in tracking and high-fidelity mapping. However, their sequential optimization framework and sensitivity to dynamic objects limit real-time performance and robustness in real-world scenarios. We present UP-SLAM, a real-time RGB-D SLAM system for dynamic environments that decouples tracking and mapping through a parallelized framework. A probabilistic octree is employed to manage Gaussian primitives adaptively, enabling efficient initialization and pruning without hand-crafted thresholds. To robustly filter dynamic regions during tracking, we propose a training-free uncertainty estimator that fuses multi-modal residuals to estimate per-pixel motion uncertainty, achieving open-set dynamic object handling without reliance on semantic labels. Furthermore, a temporal encoder is designed to enhance rendering quality. Concurrently, low-dimensional features are efficiently transformed via a shallow multilayer perceptron to construct DINO features, which are then employed to enrich the Gaussian field and improve the robustness of uncertainty prediction. Extensive experiments on multiple challenging datasets suggest that UP-SLAM outperforms state-of-the-art methods in both localization accuracy (by 59.8%) and rendering quality (by 4.57 dB PSNR), while maintaining real-time performance and producing reusable, artifact-free static maps in dynamic environments. Project page: https://aczheng-cai.github.io/up_slam.github.io/
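The abstract does not give the estimator's formula, so the following is only a minimal sketch of the general idea of fusing residuals from several modalities into a per-pixel motion uncertainty without any training. Everything in it is an assumption made for illustration: the function names, the median/MAD normalization, the equal modality weights, and the squashing function are not taken from the paper, and the DINO feature term is left out.

```python
from statistics import median


def robust_normalize(residuals):
    """Scale absolute residuals by a robust spread estimate (median + MAD).

    Assumption: this makes photometric and depth residuals roughly
    comparable before they are fused. The paper may normalize differently.
    """
    abs_r = [abs(r) for r in residuals]
    med = median(abs_r)
    mad = median(abs(r - med) for r in abs_r)
    scale = med + 1.4826 * mad + 1e-8  # small epsilon avoids division by zero
    return [r / scale for r in abs_r]


def fuse_uncertainty(photo_res, depth_res, w_photo=0.5, w_depth=0.5):
    """Return a per-pixel uncertainty in [0, 1); larger means more likely dynamic.

    The weights are hypothetical defaults, not values from the paper.
    """
    p = robust_normalize(photo_res)
    d = robust_normalize(depth_res)
    fused = [w_photo * a + w_depth * b for a, b in zip(p, d)]
    # x / (1 + x) maps [0, inf) monotonically onto [0, 1)
    return [x / (1.0 + x) for x in fused]


def tracking_weights(uncertainty):
    """Down-weight uncertain pixels so they contribute less to pose estimation."""
    return [1.0 - u for u in uncertainty]


# Toy flattened image: the last pixel has large residuals in both modalities.
photo = [0.01, 0.02, 0.01, 0.90]
depth = [0.005, 0.004, 0.006, 0.50]
u = fuse_uncertainty(photo, depth)
w = tracking_weights(u)
```

In this toy input the last pixel receives the highest uncertainty and therefore the lowest tracking weight, which is the behavior one would want for a pixel on a moving object.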
Problem

Research questions and friction points this paper is trying to address.

Enhances real-time RGB-D SLAM in dynamic environments
Improves robustness against dynamic objects without semantic labels
Boosts localization accuracy and rendering quality simultaneously
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallelized framework decouples tracking and mapping
Probabilistic octree adaptively manages Gaussian primitives
Training-free uncertainty estimator filters dynamic regions
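To make the probabilistic octree idea more concrete, here is a minimal sketch of an octree whose leaves store a log-odds belief that is raised or lowered as observations arrive. It is not the authors' data structure: the class names, the `hit`, `miss`, and `clamp` values, and the rule of updating a single leaf per observation are all illustrative assumptions, and points outside the root cube are not handled.

```python
import math


class OctreeNode:
    def __init__(self, center, half_size):
        self.center = center
        self.half_size = half_size
        self.children = None   # dict: octant index -> OctreeNode, created lazily
        self.log_odds = 0.0    # 0.0 corresponds to probability 0.5


class ProbabilisticOctree:
    """Octree with a log-odds belief per leaf (all parameters are hypothetical)."""

    def __init__(self, center=(0.0, 0.0, 0.0), half_size=8.0,
                 min_half_size=0.5, hit=0.85, miss=-0.4, clamp=3.5):
        self.root = OctreeNode(center, half_size)
        self.min_half_size = min_half_size
        self.hit, self.miss, self.clamp = hit, miss, clamp

    def _leaf(self, point, create):
        """Descend to the leaf containing point; optionally create missing nodes."""
        node = self.root
        while node.half_size > self.min_half_size:
            # One bit per axis selects one of the eight octants.
            idx = sum(1 << i for i in range(3) if point[i] >= node.center[i])
            if node.children is None:
                if not create:
                    return None
                node.children = {}
            child = node.children.get(idx)
            if child is None:
                if not create:
                    return None
                h = node.half_size / 2.0
                c = tuple(node.center[i] + (h if point[i] >= node.center[i] else -h)
                          for i in range(3))
                child = OctreeNode(c, h)
                node.children[idx] = child
            node = child
        return node

    def update(self, point, observed_static):
        """Raise the belief on a supporting observation, lower it otherwise."""
        leaf = self._leaf(point, create=True)
        delta = self.hit if observed_static else self.miss
        leaf.log_odds = max(-self.clamp, min(self.clamp, leaf.log_odds + delta))

    def probability(self, point):
        """Return the belief as a probability; unseen space stays at 0.5."""
        leaf = self._leaf(point, create=False)
        if leaf is None:
            return 0.5
        return 1.0 / (1.0 + math.exp(-leaf.log_odds))


tree = ProbabilisticOctree()
for _ in range(3):
    tree.update((1.2, 0.3, 0.7), observed_static=True)
for _ in range(2):
    tree.update((-3.0, -3.0, -3.0), observed_static=False)
```

A mapping thread could, under these assumptions, spawn Gaussian primitives only in leaves whose probability is high and prune primitives in leaves whose probability has decayed, so that insertion and deletion follow the accumulated evidence rather than a single frame.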