ORION: Option-Regularized Deep Reinforcement Learning for Cooperative Multi-Agent Online Navigation

📅 2026-01-03
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of multi-agent cooperative online navigation in partially known environments by proposing the ORION framework, which balances individual path optimality against team-level information sharing. Starting from an imperfect prior map and operating within a decentralized architecture, ORION integrates a shared graph encoder that fuses prior knowledge with online perception. It further incorporates an option-critic reinforcement learning mechanism and a dual-stage cooperation strategy, enabling agents to adaptively switch between navigation and exploration behaviors. Experimental results show that ORION significantly outperforms existing methods in both maze and large-scale warehouse simulation environments, while real-world robot team deployments validate its efficiency, robustness, and practical applicability.
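The map-fusion idea in the summary (online perception overriding an imperfect prior) can be illustrated with a toy grid-belief update. This is a minimal sketch with hypothetical names and a flat cell list, not the paper's learned shared graph encoder:

```python
# Toy fusion of an imperfect prior map with online observations.
# Cells the robot has actually observed override the prior belief;
# unobserved cells fall back to the (possibly wrong) prior.
UNKNOWN, FREE, WALL = -1, 0, 1

def fuse(prior, observed):
    """Return a fused belief: trust observations where available."""
    return [obs if obs != UNKNOWN else pri
            for pri, obs in zip(prior, observed)]

prior    = [FREE, WALL, FREE, FREE]        # prior map claims cell 1 is a wall
observed = [UNKNOWN, FREE, UNKNOWN, WALL]  # robot saw cell 1 free, cell 3 blocked
fused = fuse(prior, observed)              # -> [FREE, FREE, FREE, WALL]
```

Sharing such fused beliefs across the team is what lets one agent's observation correct a map discrepancy for its teammates.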

📝 Abstract
Existing methods for multi-agent navigation typically assume fully known environments, offering limited support for partially known scenarios such as warehouses or factory floors. There, agents may need to plan trajectories that balance their own path optimality with their ability to collect and share information about the environment that can help their teammates reach their own goals. To these ends, we propose ORION, a novel deep reinforcement learning framework for cooperative multi-agent online navigation in partially known environments. Starting from an imperfect prior map, ORION trains agents to make decentralized decisions, coordinate to reach their individual targets, and actively reduce map uncertainty by sharing online observations in a closed perception-action loop. We first design a shared graph encoder that fuses the prior map with online perception into a unified representation, providing robust state embeddings under dynamic map discrepancies. At the core of ORION is an option-critic framework that learns to reason about a set of high-level cooperative modes that translate into sequences of low-level actions, allowing agents to switch between individual navigation and team-level exploration adaptively. We further introduce a dual-stage cooperation strategy that enables agents to assist teammates under map uncertainty, thereby reducing the overall makespan. Across extensive maze-like maps and large-scale warehouse environments, our simulation results show that ORION achieves high-quality, real-time decentralized cooperation over varying team sizes, outperforming state-of-the-art classical and learning-based baselines. Finally, we validate ORION on physical robot teams, demonstrating its robustness and practicality for real-world cooperative navigation.
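The option-critic structure the abstract describes, a meta-policy choosing among high-level modes, each with its own intra-option policy and termination condition, can be sketched in a toy one-dimensional world. All names here (`navigate`, `explore`, `option_value`) are hypothetical stand-ins; the paper's actual framework uses learned neural policies, a learned critic, and graph-based state embeddings:

```python
class Option:
    """A high-level behavior mode: an intra-option policy plus a
    termination condition over states. State = (pos, goal, frontier)."""
    def __init__(self, name, policy, termination):
        self.name = name
        self.policy = policy            # state -> low-level action (+1 / -1)
        self.termination = termination  # state -> bool (option should end?)

def navigate_policy(state):
    pos, goal, _ = state
    return 1 if goal > pos else -1      # greedy step toward own goal

def explore_policy(state):
    pos, _, frontier = state
    return 1 if frontier > pos else -1  # step toward the unknown frontier

options = [
    Option("navigate", navigate_policy,
           termination=lambda s: s[0] == s[1]),  # goal reached
    Option("explore", explore_policy,
           termination=lambda s: s[0] == s[2]),  # frontier observed
]

def option_value(option, state):
    # Stand-in for the learned critic Q(s, option): in ORION this is
    # trained end-to-end; here we just prefer the closer objective, and
    # an already-visited frontier has no remaining exploration value.
    pos, goal, frontier = state
    if option.name == "explore":
        dist = abs(frontier - pos)
        return -dist if dist > 0 else float("-inf")
    return -abs(goal - pos)

def run_episode(pos, goal, frontier, max_steps=20):
    state = (pos, goal, frontier)
    current = max(options, key=lambda o: option_value(o, state))
    trace = []
    for _ in range(max_steps):
        if state[0] == state[1]:        # agent is at its goal
            break
        if current.termination(state):  # option ended: meta-policy re-selects
            current = max(options, key=lambda o: option_value(o, state))
        state = (state[0] + current.policy(state), state[1], state[2])
        trace.append((current.name, state[0]))
    return trace

trace = run_episode(pos=0, goal=5, frontier=2)
# The agent first explores the nearby frontier, then switches to
# navigation once exploration terminates:
# [('explore', 1), ('explore', 2), ('navigate', 3), ('navigate', 4), ('navigate', 5)]
```

The key property this illustrates is temporal abstraction: the meta-policy only re-decides when an option terminates, so the agent commits to coherent stretches of behavior rather than re-evaluating modes at every step.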
Problem

Research questions and friction points this paper is trying to address.

multi-agent navigation
partially known environments
map uncertainty
cooperative navigation
online perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

Option-Critic Framework
Multi-Agent Navigation
Partially Known Environments
Graph-Based State Representation
Decentralized Cooperation