MARLIN: Multi-Agent Reinforcement Learning Guided by Language-Based Inter-Robot Negotiation

📅 2024-10-18
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the prolonged training, opaque decision-making, and high deployment risk of multi-robot systems trained via multi-agent reinforcement learning (MARL), this paper proposes an LLM-augmented MARL framework. The method introduces an LLM-based linguistic negotiation mechanism among robots, coupling natural-language task coordination with policy learning and switching dynamically between the two modes during training. It also includes a language-driven action planning module and a hybrid decision architecture, in which the LLM generates executable task plans that guide RL policy updates. Experimental results demonstrate that, while maintaining equivalent task performance, the proposed framework reduces training episodes by an average of 42%. It also accelerates safe deployment from simulation to physical robots, outperforming pure RL baselines in real-world evaluations.

📝 Abstract
Multi-agent reinforcement learning is a key method for training multi-robot systems over a series of episodes in which robots are rewarded or punished according to their performance; only once the system is trained to a suitable standard is it deployed in the real world. If the system is not trained enough, the task will likely not be completed and could pose a risk to the surrounding environment. We introduce Multi-Agent Reinforcement Learning guided by Language-based Inter-Robot Negotiation (MARLIN), in which the training process requires fewer training episodes to reach peak performance. Robots are equipped with large language models that negotiate and debate a task, producing plans used to guide the policy during training. The approach dynamically switches between using reinforcement learning and large language model-based action negotiation throughout training. This reduces the number of training episodes required, compared to standard multi-agent reinforcement learning, and hence allows the system to be deployed to physical hardware earlier. The performance of this approach is evaluated against multi-agent reinforcement learning, showing that our hybrid method achieves comparable results with significantly reduced training time.
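The dynamic switching described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the switching rule (follow the negotiated plan only while it scores higher than the current policy), the scores, and every name below (`negotiate_plan_stub`, `policy_actions_stub`, `select_source`, `run_training`) are hypothetical and chosen for illustration only. The LLM negotiation and the RL update are replaced by stubs.

```python
from typing import List

Plan = List[int]  # hypothetical encoding: one discrete action index per step


def negotiate_plan_stub(num_steps: int) -> Plan:
    """Stand-in for inter-robot LLM negotiation (a real system would query LLMs)."""
    return [0] * num_steps


def policy_actions_stub(num_steps: int) -> Plan:
    """Stand-in for sampling actions from the learned RL policy."""
    return [1] * num_steps


def select_source(policy_score: float, plan_score: float) -> str:
    """Assumed rule: use the negotiated plan only while it beats the policy."""
    return "llm" if plan_score > policy_score else "rl"


def run_training(episodes: int, plan_score: float = 5.0) -> List[str]:
    """Toy loop recording which action source is used in each episode."""
    policy_score = 0.0  # toy estimate of how well the policy currently performs
    history: List[str] = []
    for _ in range(episodes):
        source = select_source(policy_score, plan_score)
        if source == "llm":
            actions = negotiate_plan_stub(num_steps=4)
        else:
            actions = policy_actions_stub(num_steps=4)
        # A real system would execute `actions`, collect rewards, and update
        # the policy here. Toy assumption: the policy improves by a fixed
        # amount each episode, whichever source supplied the actions.
        policy_score += 1.0
        history.append(source)
    return history


if __name__ == "__main__":
    print(run_training(8))
```

In this toy run the negotiated plan is used for the early episodes, when the untrained policy scores poorly, and control passes to the policy once its score reaches the plan's. That mirrors the stated idea that language-based plans guide the policy during training, but the specific rule here is an assumption.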
Problem

Research questions and friction points this paper is trying to address.

Enhance multi-robot system training efficiency
Reduce training time and resource consumption
Improve transparency in robot decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Language-based negotiation enhances multi-agent training.
Dynamic switching between RL and negotiation methods.
Large language models guide robot policy planning.
Toby Godfrey
School of Electronics & Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom
William Hunt
Postgraduate Researcher, University of Southampton
Mohammad Divband Soorati
School of Electronics & Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom