AI Summary
This work addresses multi-robot coordination in cluttered environments, where physical constraints such as robot-robot collisions, robot-obstacle collisions, and unreachable motions hinder effective collaboration. The authors propose a hybrid two-layer planning framework that jointly optimizes high-level task planning and low-level motion planning. The key contributions are a concise waypoint-based trajectory parameterization and a curriculum-learning-inspired credit assignment mechanism that propagates motion feasibility feedback from the motion planner to the task planner. Training uses a modified RLVR algorithm, which enables joint optimization of both layers. On the BoxNet3D-OBS benchmark, which features dense obstacles and up to nine robots, the proposed method consistently achieves higher task success rates than motion-agnostic and VLA-based baselines.
Abstract
Multi-robot control in cluttered environments is a challenging problem that involves complex physical constraints, including robot-robot collisions, robot-obstacle collisions, and unreachable motions. Successful planning in such settings requires joint optimization over high-level task planning and low-level motion planning, as violations of physical constraints may arise from failures at either level. However, jointly optimizing task and motion planning is difficult due to the complex parameterization of low-level motion trajectories and the ambiguity of credit assignment across the two planning levels. In this paper, we propose a hybrid multi-robot control framework that jointly optimizes task and motion planning. To enable effective parameterization of low-level planning, we introduce waypoints, a simple yet expressive representation for motion trajectories. To address the credit assignment challenge, we adopt a curriculum-based training strategy with a modified RLVR algorithm that propagates motion feasibility feedback from the motion planner to the task planner. Experiments on BoxNet3D-OBS, a challenging multi-robot benchmark with dense obstacles and up to nine robots, show that our approach consistently improves task success over motion-agnostic and VLA-based baselines. Our code is available at https://github.com/UCSB-NLP-Chang/navigate-cluster
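To make the waypoint idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a trajectory is a short list of 3D waypoints joined by straight segments, that obstacles are spheres, and that reachability is a maximum distance from the robot base. The names `interpolate`, `is_feasible`, and `feasibility_reward` are hypothetical, and the binary reward only illustrates the kind of verifiable feasibility signal that could be passed back to a task planner.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
Sphere = Tuple[Point, float]  # (center, radius); spherical obstacles are an assumption


def interpolate(waypoints: List[Point], steps_per_segment: int = 10) -> List[Point]:
    """Densify a waypoint list into a piecewise-linear path."""
    path = [waypoints[0]]
    for a, b in zip(waypoints, waypoints[1:]):
        for k in range(1, steps_per_segment + 1):
            t = k / steps_per_segment
            path.append((a[0] + t * (b[0] - a[0]),
                         a[1] + t * (b[1] - a[1]),
                         a[2] + t * (b[2] - a[2])))
    return path


def is_feasible(waypoints: List[Point], obstacles: List[Sphere],
                base: Point, reach: float) -> bool:
    """Check sampled path points for reachability and obstacle collisions."""
    for p in interpolate(waypoints):
        if math.dist(p, base) > reach:        # unreachable motion
            return False
        for center, radius in obstacles:
            if math.dist(p, center) < radius:  # robot-obstacle collision
                return False
    return True


def feasibility_reward(waypoints: List[Point], obstacles: List[Sphere],
                       base: Point, reach: float) -> float:
    """Hypothetical binary reward: 1.0 if the waypoint path is feasible, else 0.0."""
    return 1.0 if is_feasible(waypoints, obstacles, base, reach) else 0.0
```

In this sketch, a straight path through an obstacle receives zero reward, while the same start and goal with one extra detour waypoint can become feasible, which is why a handful of waypoints can be an expressive yet compact trajectory representation.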