🤖 AI Summary
In dynamic, highly constrained environments, end-to-end multi-agent cooperative control suffers from low sample efficiency and poor reliability, while model-based approaches exhibit limited generalization. To address these challenges, this paper proposes a hierarchical reinforcement learning framework: a high-level RL policy performs structured Region-of-Interest (ROI)-guided tactical decision-making, while a low-level Model Predictive Control (MPC) module executes safe motion planning. The key innovation lies in the explicit coupling of ROI-driven target selection with MPC-based execution, which enables behavioral generalization without predefined reference trajectories. Evaluated on a predator–prey benchmark task, the method achieves significant improvements over end-to-end and masked RL baselines: +23.6% cumulative reward, −58.4% collision rate, and improved group behavioral consistency.
📝 Abstract
Achieving safe and coordinated behavior in dynamic, constraint-rich environments remains a major challenge for learning-based control. Pure end-to-end learning often suffers from poor sample efficiency and limited reliability, while model-based methods depend on predefined references and struggle to generalize. We propose a hierarchical framework that combines tactical decision-making via reinforcement learning (RL) with low-level execution through Model Predictive Control (MPC). For multi-agent systems, this means that high-level policies select abstract targets from structured regions of interest (ROIs), while MPC ensures dynamically feasible and safe motion. Tested on a predator–prey benchmark, our approach outperforms end-to-end and shielding-based RL baselines in terms of reward, safety, and consistency, underscoring the benefits of combining structured learning with model-based control.
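The two-level structure described above (a high-level policy periodically choosing an abstract ROI target, a low-level MPC tracking it under collision constraints) can be sketched as a minimal toy loop. All names, dynamics, cost terms, and constants here are illustrative assumptions, not the paper's implementation: the high-level "policy" is a nearest-ROI heuristic standing in for the learned RL component, and the low level is a simple sampling-based MPC.

```python
# Hypothetical sketch of the hierarchical RL-over-MPC loop: a high-level
# target selector re-plans every K_HIGH steps; a low-level sampling-based
# MPC picks controls that track the target while penalizing obstacle
# proximity. Single 2D point agent; all constants are illustrative.
import math
import random

DT, K_HIGH, HORIZON, N_SAMPLES = 0.1, 5, 8, 64
SAFE_DIST = 0.5

def select_roi_target(agent, rois):
    """High-level stand-in: pick the nearest ROI centroid.
    (In the paper, this tactical choice is learned with RL.)"""
    return min(rois, key=lambda r: math.dist(agent, r))

def mpc_step(agent, target, obstacles):
    """Low-level sampling-based MPC: roll out random constant controls
    over a short horizon, keep the first action of the cheapest rollout."""
    best_u, best_cost = (0.0, 0.0), float("inf")
    for _ in range(N_SAMPLES):
        u = (random.uniform(-1, 1), random.uniform(-1, 1))
        x, y = agent
        cost = 0.0
        for _ in range(HORIZON):
            x, y = x + u[0] * DT, y + u[1] * DT
            cost += math.dist((x, y), target)        # tracking term
            for ob in obstacles:                     # safety term
                if math.dist((x, y), ob) < SAFE_DIST:
                    cost += 100.0                    # collision penalty
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u

def run(agent, rois, obstacles, steps=50):
    target = select_roi_target(agent, rois)
    for t in range(steps):
        if t % K_HIGH == 0:                          # high-level re-plan
            target = select_roi_target(agent, rois)
        u = mpc_step(agent, target, obstacles)       # low-level execution
        agent = (agent[0] + u[0] * DT, agent[1] + u[1] * DT)
    return agent

random.seed(0)
final = run((0.0, 0.0), rois=[(2.0, 2.0)], obstacles=[(1.0, 1.0)])
```

The point of the sketch is the separation of concerns: only the target-selection step would be replaced by a learned policy, while the MPC layer keeps motion dynamically feasible and safe regardless of what the policy proposes, which is what removes the need for predefined reference trajectories.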