MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation

📅 2025-06-23
🤖 AI Summary
Existing neural implicit SLAM methods are limited to single-agent settings and struggle with large-scale scenes and long sequences. Existing NeRF-based multi-agent SLAM frameworks cannot meet communication bandwidth constraints, and no real-world dataset for NeRF-based or GS-based SLAM provides both continuous-time ground-truth trajectories and high-accuracy 3D mesh ground truth. This paper proposes what the authors describe as the first distributed multi-agent collaborative neural SLAM framework. It combines a triplane-grid joint scene representation, distributed camera tracking, an intra-to-inter loop closure method for local (single-agent) and global (multi-agent) consistency, and online distillation for fusing multiple submaps. Experiments on various datasets show that the method outperforms prior work in mapping, tracking, and communication. The authors also introduce DES, described as the first real-world dense SLAM dataset with high-accuracy ground truth for both 3D mesh and continuous-time camera trajectory, covering single-agent and multi-agent scenarios.

📝 Abstract
Neural implicit scene representations have recently shown promising results in dense visual SLAM. However, existing implicit SLAM algorithms are constrained to single-agent scenarios and struggle with large-scale scenes and long sequences. Existing NeRF-based multi-agent SLAM frameworks cannot meet communication bandwidth constraints. To this end, we propose the first distributed multi-agent collaborative neural SLAM framework with hybrid scene representation, distributed camera tracking, intra-to-inter loop closure, and online distillation for multiple submap fusion. A novel triplane-grid joint scene representation method is proposed to improve scene reconstruction. A novel intra-to-inter loop closure method is designed to achieve local (single-agent) and global (multi-agent) consistency. We also design a novel online distillation method to fuse the information of different submaps to achieve global consistency. Furthermore, to the best of our knowledge, there is no real-world dataset for NeRF-based/GS-based SLAM that provides both continuous-time ground-truth trajectories and high-accuracy ground-truth 3D meshes. To this end, we propose the first real-world Dense SLAM (DES) dataset covering both single-agent and multi-agent scenarios, ranging from small rooms to large-scale outdoor scenes, with high-accuracy ground truth for both 3D mesh and continuous-time camera trajectory. This dataset can advance research in SLAM, 3D reconstruction, and visual foundation models. Experiments on various datasets demonstrate the superiority of the proposed method in mapping, tracking, and communication. The dataset and code will be open-sourced at https://github.com/dtc111111/mcnslam.
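To make the tri-plane half of the "triplane-grid joint scene representation" more concrete, here is a minimal sketch of how a 3D point can be looked up in three axis-aligned feature planes. It is not taken from the paper: the function names (`bilinear`, `query_triplane`), the nested-list plane layout, the normalized [0, 1] coordinates, and the choice to sum the three plane features are all assumptions made for illustration, and the grid component is omitted entirely.

```python
from typing import List, Sequence

# Hypothetical layout: plane[row][col] is a list of feature channels.
Plane = List[List[List[float]]]


def bilinear(plane: Plane, u: float, v: float) -> List[float]:
    """Bilinearly interpolate a feature plane at (u, v) in [0, 1]^2.

    u indexes columns and v indexes rows (an assumed convention).
    """
    rows, cols = len(plane), len(plane[0])
    # Clamp, then map normalized coordinates to continuous grid indices.
    x = min(max(u, 0.0), 1.0) * (cols - 1)
    y = min(max(v, 0.0), 1.0) * (rows - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    tx, ty = x - x0, y - y0
    return [
        (1 - ty) * ((1 - tx) * a + tx * b) + ty * ((1 - tx) * c + tx * d)
        for a, b, c, d in zip(
            plane[y0][x0], plane[y0][x1], plane[y1][x0], plane[y1][x1]
        )
    ]


def query_triplane(
    xy: Plane, xz: Plane, yz: Plane, point: Sequence[float]
) -> List[float]:
    """Project a 3D point onto the XY, XZ, and YZ planes and sum the features.

    Summation is an assumption; concatenation is another common choice.
    """
    x, y, z = point
    f_xy = bilinear(xy, x, y)
    f_xz = bilinear(xz, x, z)
    f_yz = bilinear(yz, y, z)
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]


# Usage: three tiny 2x2 single-channel planes whose value equals u.
ramp: Plane = [[[0.0], [1.0]], [[0.0], [1.0]]]
feature = query_triplane(ramp, ramp, ramp, (0.25, 0.5, 0.75))
```

In a full system the resulting feature vector would typically be passed to a small decoder that predicts geometry and color. Three 2D planes grow quadratically with resolution rather than cubically, which is one reason plane-based representations are attractive when map data must be exchanged over a limited link.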
Problem

Research questions and friction points this paper is trying to address.

Extends neural SLAM to multi-agent collaborative scenarios
Addresses large-scale scene and long sequence challenges
Meets communication bandwidth constraints in multi-agent SLAM
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid implicit neural scene representation
Distributed multi-agent camera tracking
Online distillation for submap fusion
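The idea behind distillation-based submap fusion can be illustrated with a toy example. The sketch below is not the paper's method: it assumes each submap is a "teacher" function that predicts a scalar value at a query coordinate, and it fits a single "student" (here just a 1D linear model, chosen only to keep the example short) to the averaged teacher predictions using stochastic gradient descent. The names `distill`, `submap_a`, and `submap_b`, the averaging rule, and all hyperparameters are hypothetical.

```python
import random
from typing import Callable, Sequence, Tuple

Teacher = Callable[[float], float]


def distill(
    teachers: Sequence[Teacher],
    steps: int = 5000,
    lr: float = 0.2,
    seed: int = 0,
) -> Tuple[float, float]:
    """Fit a student y = w * x + b to the mean prediction of the teachers.

    Points are sampled uniformly from [0, 1], standing in for the region
    where the submaps overlap (an assumption of this sketch).
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x = rng.random()
        # Distillation target: average of what each submap predicts at x.
        target = sum(t(x) for t in teachers) / len(teachers)
        err = (w * x + b) - target
        # Gradient step on the squared error 0.5 * err**2.
        w -= lr * err * x
        b -= lr * err
    return w, b


# Usage: two submaps that disagree by a small constant offset.
def submap_a(x: float) -> float:
    return 2.0 * x + 0.9


def submap_b(x: float) -> float:
    return 2.0 * x + 1.1


w, b = distill([submap_a, submap_b])
```

The student ends up close to the consensus of the two teachers (slope 2, offset 1), so downstream queries need only the fused model rather than every submap. A real system would replace the linear student with a neural scene representation and would need a more careful rule for weighting teachers than a plain average.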
Tianchen Deng
Shanghai Jiao Tong University
Robotics, Computer Vision

Guole Shen
Institute of Medical Robotics and Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, China

Xun Chen
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore

Shenghai Yuan
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore

Hongming Shen
Nanyang Technological University
SLAM, Sensor Fusion, Aerial Robotics

Guohao Peng
Nanyang Technological University, Singapore
Computer Vision, Robotics Perception, Artificial Intelligence, SLAM

Zhenyu Wu
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore

Jingchuan Wang
Institute of Medical Robotics and Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, China

Lihua Xie
Professor of Electrical Engineering, Nanyang Technological University
Robust Control, Networked Control, Multi-agent Systems

Danwei Wang
Professor, Nanyang Technological University
Robotics, Control Engineering, Fault Diagnosis

Hesheng Wang
Institute of Medical Robotics and Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, China

Weidong Chen
Institute of Medical Robotics and Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, China