ORION: Option-Regularized Deep Reinforcement Learning for Cooperative Multi-Agent Online Navigation

📅 2026-01-03
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of multi-agent cooperative online navigation in partially known environments by proposing the ORION framework, which balances individual path optimality against team-level information sharing. Starting from an imperfect prior map and operating within a decentralized architecture, ORION integrates a shared graph encoder that fuses prior knowledge with online perception. It further incorporates an option-critic reinforcement learning mechanism and a dual-stage cooperation strategy, enabling agents to adaptively switch between navigation and exploration behaviors. Experimental results show that ORION significantly outperforms existing methods in both maze and large-scale warehouse simulation environments, while real-world robot team deployments validate its efficiency, robustness, and practical applicability.
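The map-fusion idea in the summary (online perception overriding an imperfect prior) can be illustrated with a toy grid-belief update. This is a minimal sketch with hypothetical names and a flat cell list, not the paper's learned shared graph encoder:

```python
# Toy fusion of an imperfect prior map with online observations.
# Cells the robot has actually observed override the prior belief;
# unobserved cells fall back to the (possibly wrong) prior.
UNKNOWN, FREE, WALL = -1, 0, 1

def fuse(prior, observed):
    """Return a fused belief: trust observations where available."""
    return [obs if obs != UNKNOWN else pri
            for pri, obs in zip(prior, observed)]

prior    = [FREE, WALL, FREE, FREE]        # prior map claims cell 1 is a wall
observed = [UNKNOWN, FREE, UNKNOWN, WALL]  # robot saw cell 1 free, cell 3 blocked
fused = fuse(prior, observed)              # -> [FREE, FREE, FREE, WALL]
```

Sharing such fused beliefs across the team is what lets one agent's observation correct a map discrepancy for its teammates.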

📝 Abstract
Existing methods for multi-agent navigation typically assume fully known environments, offering limited support for partially known scenarios such as warehouses or factory floors. There, agents may need to plan trajectories that balance their own path optimality with their ability to collect and share information about the environment that can help their teammates reach their own goals. To these ends, we propose ORION, a novel deep reinforcement learning framework for cooperative multi-agent online navigation in partially known environments. Starting from an imperfect prior map, ORION trains agents to make decentralized decisions, coordinate to reach their individual targets, and actively reduce map uncertainty by sharing online observations in a closed perception-action loop. We first design a shared graph encoder that fuses the prior map with online perception into a unified representation, providing robust state embeddings under dynamic map discrepancies. At the core of ORION is an option-critic framework that learns to reason about a set of high-level cooperative modes that translate into sequences of low-level actions, allowing agents to switch between individual navigation and team-level exploration adaptively. We further introduce a dual-stage cooperation strategy that enables agents to assist teammates under map uncertainty, thereby reducing the overall makespan. Across extensive maze-like maps and large-scale warehouse environments, our simulation results show that ORION achieves high-quality, real-time decentralized cooperation over varying team sizes, outperforming state-of-the-art classical and learning-based baselines. Finally, we validate ORION on physical robot teams, demonstrating its robustness and practicality for real-world cooperative navigation.
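The option-critic structure the abstract describes, a meta-policy choosing among high-level modes, each with its own intra-option policy and termination condition, can be sketched in a toy one-dimensional world. All names here (`navigate`, `explore`, `option_value`) are hypothetical stand-ins; the paper's actual framework uses learned neural policies, a learned critic, and graph-based state embeddings:

```python
class Option:
    """A high-level behavior mode: an intra-option policy plus a
    termination condition over states. State = (pos, goal, frontier)."""
    def __init__(self, name, policy, termination):
        self.name = name
        self.policy = policy            # state -> low-level action (+1 / -1)
        self.termination = termination  # state -> bool (option should end?)

def navigate_policy(state):
    pos, goal, _ = state
    return 1 if goal > pos else -1      # greedy step toward own goal

def explore_policy(state):
    pos, _, frontier = state
    return 1 if frontier > pos else -1  # step toward the unknown frontier

options = [
    Option("navigate", navigate_policy,
           termination=lambda s: s[0] == s[1]),  # goal reached
    Option("explore", explore_policy,
           termination=lambda s: s[0] == s[2]),  # frontier observed
]

def option_value(option, state):
    # Stand-in for the learned critic Q(s, option): in ORION this is
    # trained end-to-end; here we just prefer the closer objective, and
    # an already-visited frontier has no remaining exploration value.
    pos, goal, frontier = state
    if option.name == "explore":
        dist = abs(frontier - pos)
        return -dist if dist > 0 else float("-inf")
    return -abs(goal - pos)

def run_episode(pos, goal, frontier, max_steps=20):
    state = (pos, goal, frontier)
    current = max(options, key=lambda o: option_value(o, state))
    trace = []
    for _ in range(max_steps):
        if state[0] == state[1]:        # agent is at its goal
            break
        if current.termination(state):  # option ended: meta-policy re-selects
            current = max(options, key=lambda o: option_value(o, state))
        state = (state[0] + current.policy(state), state[1], state[2])
        trace.append((current.name, state[0]))
    return trace

trace = run_episode(pos=0, goal=5, frontier=2)
# The agent first explores the nearby frontier, then switches to
# navigation once exploration terminates:
# [('explore', 1), ('explore', 2), ('navigate', 3), ('navigate', 4), ('navigate', 5)]
```

The key property this illustrates is temporal abstraction: the meta-policy only re-decides when an option terminates, so the agent commits to coherent stretches of behavior rather than re-evaluating modes at every step.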
Problem

Research questions and friction points this paper is trying to address.

multi-agent navigation
partially known environments
map uncertainty
cooperative navigation
online perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

Option-Critic Framework
Multi-Agent Navigation
Partially Known Environments
Graph-Based State Representation
Decentralized Cooperation