DINO-SLAM: DINO-informed RGB-D SLAM for Neural Implicit and Explicit Representations

📅 2025-07-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited robustness of neural implicit (NeRF) and explicit (3D Gaussian Splatting, 3DGS) representations in RGB-D SLAM, which stems from insufficient scene understanding, this paper proposes DINO-SLAM. The method combines DINO visual features with a hierarchical Scene Structure Encoder (SSE) to build a representation that is enriched in both semantics and geometry. The SSE turns DINO features into Enhanced DINO (EDINO) features, and the paper integrates these EDINO features into both NeRF-based and 3DGS-based SLAM pipelines. Evaluated on the Replica, ScanNet, and TUM datasets, DINO-SLAM improves pose estimation accuracy and map completeness, outperforming state-of-the-art SLAM methods on these benchmarks.
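
The summary reports improved pose estimation accuracy without naming a metric. On RGB-D SLAM benchmarks such as Replica, ScanNet, and TUM, pose accuracy is commonly reported as the root-mean-square absolute trajectory error (ATE RMSE); whether this paper uses exactly that metric is an assumption here. The sketch below is a minimal illustration, not the authors' evaluation code: the function name `ate_rmse` is hypothetical, and it assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same coordinate frame (the rigid alignment step normally applied first is omitted).

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of per-frame translational error between two pre-aligned trajectories.

    Each trajectory is a sequence of (x, y, z) camera positions; frame i of
    `estimated` is assumed to correspond to frame i of `ground_truth`.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Squared Euclidean distance between matching camera positions.
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy data (made up for illustration): one of three frames is off by 0.5 m.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.4), (2.0, 0.0, 0.0)]
error = ate_rmse(est, gt)
```

Because the error is squared before averaging, a single badly tracked frame raises the score more than the same total drift spread evenly over many frames.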

📝 Abstract
This paper presents DINO-SLAM, a DINO-informed design strategy to enhance neural implicit (Neural Radiance Field, NeRF) and explicit (3D Gaussian Splatting, 3DGS) representations in SLAM systems through more comprehensive scene representations. To this end, we rely on a Scene Structure Encoder (SSE) that enriches DINO features into Enhanced DINO (EDINO) features, capturing hierarchical scene elements and their structural relationships. Building on the SSE, we propose two foundational paradigms, one for NeRF-based and one for 3DGS-based SLAM systems, that integrate EDINO features. Our DINO-informed pipelines achieve superior performance on the Replica, ScanNet, and TUM datasets compared to state-of-the-art methods.

Problem

Research questions and friction points this paper is trying to address.

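Neural implicit (NeRF) and explicit (3DGS) SLAM representations lack comprehensive scene understanding
Capturing hierarchical scene elements and their structural relationships with DINO-informed features
Outperforming state-of-the-art SLAM methods on Replica, ScanNet, and TUM
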
Innovation

Methods, ideas, or system contributions that make the work stand out.

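Proposes two DINO-informed SLAM paradigms, one NeRF-based and one 3DGS-based
Uses a Scene Structure Encoder (SSE) to enrich DINO features into Enhanced DINO (EDINO) features
Achieves superior performance on Replica, ScanNet, and TUM compared to state-of-the-art methods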