AI Summary
Problem: Quadrupedal robots remain limited in long-horizon, obstacle-aware manipulation of large objects in complex environments.
Method: This paper proposes a hierarchical multi-agent reinforcement learning framework: at the high level, a centralized adaptive policy coordinated with RRT-based planning generates task commands; at the mid level, a decentralized goal-conditioned policy enables collaborative decision-making among multiple robots; and at the low level, a pre-trained locomotion controller ensures robust motion generation. The framework supports centralized training with decentralized execution (CTDE) and enables efficient sim-to-real transfer on the Go1 platform.
Results: Experiments demonstrate a 36.0% improvement in task success rate and a 24.5% reduction in completion time in simulation. Crucially, the method achieves the first successful real-robot execution of long-horizon, obstacle-aware pushing tasks (Push-Cuboid and Push-T) on physical quadrupeds, significantly enhancing operational practicality for applications such as search-and-rescue and industrial automation.
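To make the three-level decomposition concrete, here is a minimal, self-contained sketch of the control flow it describes. All class and function names are illustrative, not the paper's actual API; the mid-level "policy" is a simple proportional rule standing in for the learned goal-conditioned policy, and the low level integrates velocity commands directly in place of the pre-trained locomotion controller.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

class HighLevel:
    """High level (centralized): consumes an RRT waypoint list and emits the
    next subgoal, advancing once the object gets within a tolerance
    (0.3 m here is an assumed value, not from the paper)."""
    def __init__(self, waypoints, tol=0.3):
        self.waypoints, self.tol = list(waypoints), tol

    def subgoal(self, object_pos):
        while len(self.waypoints) > 1 and dist(object_pos, self.waypoints[0]) < self.tol:
            self.waypoints.pop(0)
        return self.waypoints[0]

class MidLevelAgent:
    """Mid level (decentralized): each robot independently maps its own pose
    plus the shared subgoal to a velocity command -- CTDE at execution time.
    A proportional rule stands in for the learned policy."""
    def command(self, robot_pos, subgoal, gain=1.0, v_max=0.5):
        dx, dy = subgoal[0] - robot_pos[0], subgoal[1] - robot_pos[1]
        n = math.hypot(dx, dy) or 1.0
        s = min(v_max, gain * n) / n          # clamp commanded speed
        return (dx * s, dy * s)

def step_low_level(pos, vel, dt=0.1):
    """Low level stand-in: the pre-trained locomotion policy would track the
    velocity command on hardware; here we simply integrate it."""
    return (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
```

A usage loop would query `HighLevel.subgoal` each tick, have every `MidLevelAgent` compute its own command from that shared subgoal, and pass the commands to the low level, which is the interaction pattern the summary above describes.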
Abstract
Recently, quadrupedal robots have achieved significant success in locomotion, but their manipulation capabilities, particularly in handling large objects, remain limited, restricting their usefulness in demanding real-world applications such as search and rescue, construction, industrial automation, and room organization. This paper tackles the task of obstacle-aware, long-horizon pushing by multiple quadrupedal robots. We propose a hierarchical multi-agent reinforcement learning framework with three levels of control. The high-level controller integrates an RRT planner and a centralized adaptive policy to generate subgoals, while the mid-level controller uses a decentralized goal-conditioned policy to guide the robots toward these subgoals. A pre-trained low-level locomotion policy executes the movement commands. We evaluate our method against several baselines in simulation, demonstrating significant improvements: a 36.0% higher success rate and a 24.5% reduction in completion time relative to the best baseline. Our framework successfully enables long-horizon, obstacle-aware manipulation tasks like Push-Cuboid and Push-T on Go1 robots in the real world.
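The high-level controller relies on an RRT planner to produce obstacle-aware waypoints for the object. As a reference for how RRT generates such a waypoint sequence, here is a minimal 2-D sketch; the step size, goal bias, sampling bounds, and collision test are assumptions for illustration, not parameters from the paper.

```python
import math
import random

def rrt_plan(start, goal, is_free, step=0.5, goal_tol=0.6, iters=5000, seed=0):
    """Grow a rapidly-exploring random tree from start toward goal in 2-D.
    is_free(p) -> bool reports whether point p is collision-free.
    Returns a start-to-goal waypoint list, or None on failure."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(iters):
        # Goal bias: sample the goal 10% of the time to speed convergence.
        sample = goal if rng.random() < 0.1 else (rng.uniform(-5, 5), rng.uniform(-5, 5))
        # Find the nearest existing tree node to the sample.
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        near = nodes[i]
        d = math.dist(near, sample)
        if d == 0:
            continue
        # Steer a fixed step from the nearest node toward the sample.
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if not is_free(new):
            continue
        parent[len(nodes)] = i
        nodes.append(new)
        if math.dist(new, goal) < goal_tol:
            # Walk parent pointers back to the root to recover the path.
            path, k = [], len(nodes) - 1
            while k is not None:
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None
```

In the framework above, the resulting waypoints would be handed to the centralized adaptive policy as candidate subgoals; practical variants typically add path smoothing before execution.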