Policies over Poses: Reinforcement Learning based Distributed Pose-Graph Optimization for Multi-Robot SLAM

📅 2025-10-26
🤖 AI Summary
To address the tendency of distributed pose-graph optimization (PGO) in multi-robot SLAM to become trapped in local optima, as well as its poor scalability, this paper proposes the first scalable PGO framework based on multi-agent reinforcement learning (MARL). PGO is formulated as a partially observable Markov game, and the method introduces a recurrent edge-conditioned graph neural network coupled with an adaptive edge-gating mechanism. A memory-augmented hybrid policy enables joint denoising and collaborative optimization, while consensus-based consistency constraints drive convergence to a globally consistent estimate. The framework supports cross-scale policy transfer and zero-shot generalization to larger robot teams. On multiple standard benchmarks, the method reduces the optimization objective by 37.5% on average and accelerates inference by over 6× compared with state-of-the-art approaches, demonstrating superior accuracy, efficiency, and scalability.

📝 Abstract
We consider the distributed pose-graph optimization (PGO) problem, which is fundamental to accurate trajectory estimation in multi-robot simultaneous localization and mapping (SLAM). Conventional iterative approaches linearize a highly non-convex objective and repeatedly solve normal equations, often converging to local minima and thus producing suboptimal estimates. We propose a scalable, outlier-robust distributed planar PGO framework using Multi-Agent Reinforcement Learning (MARL). We cast distributed PGO as a partially observable Markov game defined on local pose graphs, where each action refines a single edge's pose estimate. A graph partitioner decomposes the global pose graph, and each robot runs a recurrent edge-conditioned Graph Neural Network (GNN) encoder with adaptive edge gating to denoise noisy edges. Robots sequentially refine poses through a hybrid policy that utilizes prior action memory and graph embeddings. After local graph correction, a consensus scheme reconciles inter-robot disagreements to produce a globally consistent estimate. Extensive evaluations on a comprehensive suite of synthetic and real-world datasets demonstrate that our learned MARL-based actors reduce the global objective by an average of 37.5% more than the state-of-the-art distributed PGO framework, while improving inference efficiency by at least 6×. We also demonstrate that actor replication allows a single learned policy to scale effortlessly to substantially larger robot teams without any retraining. Code is publicly available at https://github.com/herolab-uga/policies-over-poses.
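To make the optimization target concrete, the planar PGO objective the abstract refers to is typically a sum of squared relative-pose residuals over the graph's edges. The sketch below is a minimal, generic illustration of that objective for SE(2) poses; the variable names and the unweighted least-squares form are assumptions for illustration, not the paper's exact formulation (which the actors reduce via learned edge refinements rather than linearized normal equations).

```python
import math

def wrap(a):
    # Wrap an angle into (-pi, pi].
    return math.atan2(math.sin(a), math.cos(a))

def edge_residual(pose_i, pose_j, meas):
    # pose = (x, y, theta); meas = measured relative pose of j in frame i.
    xi, yi, ti = pose_i
    xj, yj, tj = pose_j
    dx, dy = xj - xi, yj - yi
    c, s = math.cos(ti), math.sin(ti)
    # Predicted relative pose of j, expressed in the frame of i.
    pred = (c * dx + s * dy, -s * dx + c * dy, wrap(tj - ti))
    rx = pred[0] - meas[0]
    ry = pred[1] - meas[1]
    rt = wrap(pred[2] - meas[2])  # angular residual stays wrapped
    return (rx, ry, rt)

def objective(poses, edges):
    # Sum of squared residuals over all relative-pose edges.
    return sum(r * r
               for i, j, meas in edges
               for r in edge_residual(poses[i], poses[j], meas))
```

An edge whose measurement exactly matches the current estimates contributes zero; noisy or outlier edges inflate the objective, which is why the framework's edge-gating and per-edge refinement actions target this quantity directly.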
Problem

Research questions and friction points this paper is trying to address.

Solving distributed pose-graph optimization for multi-robot SLAM
Overcoming local minima in non-convex trajectory estimation
Enhancing scalability and robustness in multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Multi-Agent Reinforcement Learning for pose-graph optimization
Employs recurrent edge-conditioned GNN encoder with adaptive gating
Implements hybrid policy with action memory and consensus scheme
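The consensus step mentioned above reconciles disagreements between robots that hold separate estimates of shared boundary poses. The paper's specific update rule is not given on this page; the following is a minimal, generic average-consensus sketch (the function name, `alpha` mixing weight, and synchronous update are all illustrative assumptions) showing how repeated local averaging over a communication graph drives robots toward agreement.

```python
def consensus_step(estimates, neighbors, alpha=0.5):
    # estimates: {robot: local estimate of one shared pose coordinate}
    # neighbors: {robot: list of robots it communicates with}
    # One synchronous averaging update; iterating drives all estimates
    # toward a common value on a connected communication graph.
    updated = {}
    for robot, x in estimates.items():
        nbrs = neighbors[robot]
        nbr_avg = sum(estimates[n] for n in nbrs) / len(nbrs) if nbrs else x
        updated[robot] = (1 - alpha) * x + alpha * nbr_avg
    return updated
```

For two robots that disagree on a shared pose coordinate (say 0.0 vs. 1.0), a single step with `alpha=0.5` already moves both to the midpoint; with more robots and sparser graphs, several iterations are needed before the estimates coincide.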
Sai Krishna Ghanta
School of Computing, University of Georgia, Athens, GA 30602, USA
Ramviyas Parasuraman
University of Georgia
Robotics · Multi-Robot Systems · Rescue Robotics · Networked Robotics · Swarm Robotics