CoLF: Learning Consistent Leader-Follower Policies for Vision-Language-Guided Multi-Robot Cooperative Transport

📅 2026-02-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of perceptual inconsistency in vision-language-guided cooperative transport among multiple robots, which arises from divergent viewpoints and linguistic ambiguity. To this end, the authors propose the CoLF framework, which employs a dependent leader-follower architecture integrating an asymmetric policy network and a mutual-information-maximization mechanism. This design enables the follower to predict the leader's actions from its local observations, thereby achieving stable role assignment and consistent collaboration. The approach is formulated within a multi-agent reinforcement learning setting and optimized under the centralized training with decentralized execution (CTDE) paradigm by maximizing a variational lower bound on the mutual information. Experimental results on both simulated and real quadruped-robot platforms demonstrate that the proposed method significantly improves task success rates and collaborative stability.
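As a concrete illustration of the mutual-information mechanism described above, here is a minimal PyTorch-style sketch. It assumes a discrete leader action space and a flat follower observation vector; the names (LeaderActionPredictor, mi_aux_loss, beta) are illustrative and not taken from the paper. Maximizing the log-likelihood of the leader's action under a variational predictor conditioned on the follower's observation is the standard way to optimize a variational lower bound on the mutual information between the two.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LeaderActionPredictor(nn.Module):
    """Variational predictor q_phi(a_leader | o_follower).

    Hypothetical module; the paper's actual architecture is not specified
    on this page. Assumes a flat follower observation and discrete leader actions.
    """

    def __init__(self, obs_dim: int, n_leader_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_leader_actions),
        )

    def forward(self, follower_obs: torch.Tensor) -> torch.Tensor:
        # Logits over the leader's discrete actions.
        return self.net(follower_obs)


def mi_aux_loss(predictor: LeaderActionPredictor,
                follower_obs: torch.Tensor,
                leader_actions: torch.Tensor) -> torch.Tensor:
    """Negative sample estimate of E[log q_phi(a_L | o_F)].

    Minimizing this cross-entropy maximizes the variational lower bound on
    I(a_L; o_F); the entropy term H(a_L) does not depend on phi.
    """
    logits = predictor(follower_obs)
    return F.cross_entropy(logits, leader_actions)


# Usage sketch (hypothetical weighting `beta`): added to the RL loss during
# centralized training, while execution remains decentralized.
# total_loss = rl_loss + beta * mi_aux_loss(predictor, o_follower, a_leader)
```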

📝 Abstract
In this study, we address vision-language-guided multi-robot cooperative transport, where each robot grounds natural-language instructions from onboard camera observations. A key challenge in this decentralized setting is perceptual misalignment across robots, where viewpoint differences and language ambiguity can yield inconsistent interpretations and degrade cooperative transport. To mitigate this problem, we adopt a dependent leader-follower design, where one robot serves as the leader and the other as the follower. Although such a leader-follower structure appears straightforward, learning with independent and symmetric agents often yields symmetric or unstable behaviors without explicit inductive biases. To address this challenge, we propose Consistent Leader-Follower (CoLF), a multi-agent reinforcement learning (MARL) framework for stable leader-follower role differentiation. CoLF consists of two key components: (1) an asymmetric policy design that induces leader-follower role differentiation, and (2) a mutual-information-based training objective that maximizes a variational lower bound, encouraging the follower to predict the leader's action from its local observation. The leader and follower policies are jointly optimized under the centralized training and decentralized execution (CTDE) framework to balance task execution and consistent cooperative behaviors. We validate CoLF in both simulation and real-robot experiments using two quadruped robots. The demonstration video is available at https://sites.google.com/view/colf/.
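For reference, the variational lower bound mentioned in the abstract is typically of the Barber-Agakov form sketched below. The paper's exact objective and notation are not reproduced on this page, so the symbols ($a^{L}$ for the leader's action, $o^{F}$ for the follower's local observation, $q_{\phi}$ for the follower's variational predictor) are illustrative:

$$
I\left(a^{L}; o^{F}\right) \;\ge\; \mathcal{H}\left(a^{L}\right) \;+\; \mathbb{E}_{p\left(a^{L},\, o^{F}\right)}\!\left[\log q_{\phi}\!\left(a^{L} \mid o^{F}\right)\right]
$$

Maximizing the right-hand side with respect to $q_{\phi}$ (and, during centralized training, the policies that shape $p(a^{L}, o^{F})$) encourages the follower to predict the leader's action from its own observation, which matches the consistency objective described in the abstract.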
Problem

Research questions and friction points this paper is trying to address.

multi-robot cooperative transport
vision-language grounding
perceptual misalignment
leader-follower coordination
decentralized execution
Innovation

Methods, ideas, or system contributions that make the work stand out.

leader-follower
multi-agent reinforcement learning
perceptual alignment
mutual information
vision-language grounding
Joachim Yann Despature
Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology (NAIST), Nara, Japan; École polytechnique fédérale de Lausanne (EPFL), Lausanne, Switzerland
Kazuki Shibata
Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology (NAIST), Nara, Japan
Takamitsu Matsubara
Nara Institute of Science and Technology
Robot Learning · Machine Learning · Reinforcement Learning · Robotics