Latent Space Reinforcement Learning for Multi-Robot Exploration

📅 2026-01-03
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses key challenges in multi-robot cooperative exploration, including high-dimensional inputs, poor scalability, and insufficient robustness under communication constraints. To overcome these limitations, the authors propose a hierarchical deep reinforcement learning framework that employs an autoencoder to compress high-dimensional occupancy grid maps into low-dimensional latent states. The system is trained in topologically complex environments generated using Perlin noise. A decentralized coordination mechanism combined with a weighted consensus strategy—featuring tunable trust parameters—is introduced to enhance robustness when communication is limited. Experimental results demonstrate that the proposed approach significantly improves scalability, generalization, and collaborative efficiency in unknown environments, maintaining high performance even as the number of agents increases or environmental structures undergo drastic changes.
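The weighted consensus idea described above can be illustrated with a short sketch. Here each robot blends its own occupancy belief with the average of beliefs received from neighbors, weighted by a scalar trust parameter; the function name, the per-cell probability representation, and the single scalar `trust` are illustrative assumptions, not the paper's actual formulation.

```python
def consensus_update(own_belief, neighbor_beliefs, trust):
    """Blend a robot's own occupancy belief with shared neighbor beliefs.

    own_belief       -- list of per-cell occupancy probabilities in [0, 1]
    neighbor_beliefs -- list of same-length belief lists received over the
                        (possibly lossy) communication link
    trust            -- tunable weight in [0, 1] on shared data;
                        trust = 0 ignores neighbors entirely
    """
    if not neighbor_beliefs:
        # Nothing received (e.g. communication dropout): keep own estimate.
        return list(own_belief)
    fused = []
    for i, own in enumerate(own_belief):
        # Average the neighbors' estimates for this cell, then mix.
        shared = sum(nb[i] for nb in neighbor_beliefs) / len(neighbor_beliefs)
        fused.append((1.0 - trust) * own + trust * shared)
    return fused
```

Lowering `trust` makes each agent rely more on its own observations, which is one simple way to limit the accumulation of errors from unreliable shared data.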


📝 Abstract
Autonomous mapping of unknown environments is a critical challenge, particularly in scenarios where time is limited. Multi-agent systems can enhance efficiency through collaboration, but the scalability of motion-planning algorithms remains a key limitation. Reinforcement learning has been explored as a solution, but existing approaches are constrained by the limited input size required for effective learning, restricting their applicability to discrete environments. This work addresses that limitation by leveraging autoencoders to perform dimensionality reduction, compressing high-fidelity occupancy maps into latent state vectors while preserving essential spatial information. Additionally, we introduce a novel procedural generation algorithm based on Perlin noise, designed to generate topologically complex training environments that simulate asteroid fields, caves, and forests. These environments are used to train both the autoencoder and the navigation policy within a hierarchical deep reinforcement learning framework for decentralized coordination. We introduce a weighted consensus mechanism that modulates reliance on shared data via a tunable trust parameter, ensuring robustness to the accumulation of errors. Experimental results demonstrate that the proposed system scales effectively with the number of agents, generalizes well to unfamiliar, structurally distinct environments, and is resilient in communication-constrained settings.
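The Perlin-noise-based environment generation described in the abstract can be sketched by thresholding classic 2D gradient noise into a binary occupancy grid: smooth noise values above a cutoff become obstacle "blobs" resembling asteroid fields or cave walls. This is a minimal self-contained illustration of the idea; the function name, `scale`, and `threshold` parameters are assumptions, not the paper's actual generator.

```python
import math
import random

def perlin_occupancy_grid(width, height, scale=8.0, threshold=0.1, seed=0):
    """Binary occupancy grid from thresholded 2D Perlin-style noise.

    scale     -- feature size in cells (larger -> bigger obstacle blobs)
    threshold -- cells with noise above this become obstacles (1)
    """
    rng = random.Random(seed)
    grads = {}

    def grad(ix, iy):
        # Lazily assign a random unit gradient to each lattice corner.
        if (ix, iy) not in grads:
            a = rng.uniform(0.0, 2.0 * math.pi)
            grads[(ix, iy)] = (math.cos(a), math.sin(a))
        return grads[(ix, iy)]

    def fade(t):
        # Perlin's quintic easing curve for smooth interpolation.
        return t * t * t * (t * (t * 6 - 15) + 10)

    def dot_corner(ix, iy, x, y):
        gx, gy = grad(ix, iy)
        return gx * (x - ix) + gy * (y - iy)

    def noise(x, y):
        x0, y0 = int(math.floor(x)), int(math.floor(y))
        u, v = fade(x - x0), fade(y - y0)
        n00 = dot_corner(x0, y0, x, y)
        n10 = dot_corner(x0 + 1, y0, x, y)
        n01 = dot_corner(x0, y0 + 1, x, y)
        n11 = dot_corner(x0 + 1, y0 + 1, x, y)
        nx0 = n00 + u * (n10 - n00)
        nx1 = n01 + u * (n11 - n01)
        return nx0 + v * (nx1 - nx0)

    return [[1 if noise(c / scale, r / scale) > threshold else 0
             for c in range(width)]
            for r in range(height)]
```

Such grids could then be flattened and fed to an autoencoder, whose latent vector serves as the compact state for the reinforcement learning policy.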
Problem

Research questions and friction points this paper is trying to address.

multi-robot exploration
autonomous mapping
scalability
reinforcement learning
high-dimensional environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent Space Reinforcement Learning
Autoencoder-based Dimensionality Reduction
Procedural Environment Generation
Decentralized Multi-Robot Coordination
Weighted Consensus Mechanism
Sriram Rajasekar
Department of Aerospace Engineering, Indian Institute of Science, Bengaluru, India
Ashwini Ratnoo
Indian Institute of Science
Guidance and Control · Path Planning · Mobile Robotics · Dynamical Systems