LangMARL: Natural Language Multi-Agent Reinforcement Learning

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge that large language model agents struggle to learn effectively in dynamic collaborative environments due to sparse global rewards, which fail to provide the causal credit signals necessary for local policy optimization. The study introduces, for the first time, a credit assignment mechanism from multi-agent reinforcement learning into the language space, coupled with a policy gradient evolution framework. By replaying interaction trajectories to infer task-relevant causal relationships, the method generates dense and interpretable feedback signals for individual agents. This approach substantially improves sample efficiency, convergence speed, and generalization under sparse reward conditions, demonstrating consistent effectiveness across a range of cooperative tasks.
📝 Abstract
Large language model (LLM) agents struggle to autonomously evolve coordination strategies in dynamic environments, largely because coarse global outcomes obscure the causal signals needed for local policy refinement. We identify this bottleneck as a multi-agent credit assignment problem, which has long been studied in classical multi-agent reinforcement learning (MARL) but remains underaddressed in LLM-based systems. Building on this observation, we propose LangMARL, a framework that brings credit assignment and policy gradient evolution from cooperative MARL into the language space. LangMARL introduces agent-level language credit assignment, pioneers gradient evolution in language space for policy improvement, and summarizes task-relevant causal relations from replayed trajectories to provide dense feedback and improve convergence under sparse rewards. Extensive experiments across diverse cooperative multi-agent tasks demonstrate improved sample efficiency, interpretability, and strong generalization.
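The credit-assignment idea the abstract imports from cooperative MARL can be illustrated with the classical difference-rewards scheme: an agent's credit is how much the global outcome degrades when its contribution is replayed as a no-op. The sketch below is a minimal toy illustration of that classical mechanism, not the paper's language-space method; all function and agent names are hypothetical.

```python
# Illustrative sketch of classical MARL credit assignment (difference
# rewards), the idea LangMARL lifts into the language space. Each agent's
# credit is the drop in global reward when its action is replaced by a
# no-op in a replayed trajectory. All names here are hypothetical.

from typing import Callable, Dict


def difference_rewards(
    joint_actions: Dict[str, str],  # agent -> action taken in the trajectory
    global_reward: Callable[[Dict[str, str]], float],
    noop: str = "no-op",
) -> Dict[str, float]:
    """Credit agent i with G(a) - G(a with a_i replaced by a no-op)."""
    base = global_reward(joint_actions)
    credits = {}
    for agent in joint_actions:
        counterfactual = dict(joint_actions)
        counterfactual[agent] = noop  # replay without agent i's contribution
        credits[agent] = base - global_reward(counterfactual)
    return credits


# Toy cooperative task with a sparse global reward: the team scores 1.0
# only if both subtasks are covered, 0.0 otherwise.
def team_reward(actions: Dict[str, str]) -> float:
    covered = set(actions.values())
    return 1.0 if {"fetch", "assemble"} <= covered else 0.0


print(difference_rewards({"alice": "fetch", "bob": "assemble"}, team_reward))
# -> {'alice': 1.0, 'bob': 1.0}: removing either agent's action breaks the task
```

Under the sparse team reward, each agent receives a dense, individualized signal (here, 1.0 each) even though the raw environment only emits a single joint outcome; LangMARL's contribution, per the abstract, is to express this kind of counterfactual credit as natural-language feedback rather than a scalar.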
Problem

Research questions and friction points this paper is trying to address.

multi-agent credit assignment
large language models
cooperative multi-agent reinforcement learning
sparse rewards
coordination strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

language-based credit assignment
gradient evolution in language space
multi-agent reinforcement learning
causal feedback from trajectories
LLM agents