Do LLM-derived graph priors improve multi-agent coordination?

📅 2026-04-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses how to construct coordination graph structures efficiently in multi-agent reinforcement learning (MARL). Existing approaches rely on manual design, heuristic rules, or extensive environment interaction, and are consequently brittle, semantically uninformed, or data-inefficient. The paper uses a large language model (LLM) to generate coordination graph priors from minimal natural language descriptions of agent observations, then integrates those priors into the graph convolutional layers of a graph neural network (GNN) to guide policy learning. The method is evaluated on four cooperative scenarios from the Multi-Agent Particle Environment (MPE) against baselines ranging from independent learners to state-of-the-art graph-based methods, with an ablation across five compact open-source LLMs. The results indicate that LLM-derived priors can enhance coordination and adaptability in dynamic multi-agent environments, and that models as small as 1.5B parameters are sufficient for effective prior generation.
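To make the idea of a "coordination graph prior" concrete, here is a minimal sketch of how an LLM's text response could be turned into an adjacency matrix over agents. The summary does not state the prompt or output format the paper uses, so the `i-j` edge notation and the helper name `parse_edge_prior` are hypothetical and chosen only for illustration.

```python
import re


def parse_edge_prior(llm_text, n_agents):
    """Turn text such as '0-1, 1-2' into a symmetric 0/1 adjacency matrix.

    Assumption: the 'i-j' edge format is a hypothetical convention,
    not the format used in the paper.
    """
    adj = [[0] * n_agents for _ in range(n_agents)]
    for a, b in re.findall(r"(\d+)\s*-\s*(\d+)", llm_text):
        i, j = int(a), int(b)
        # Skip self-edges and agent indices outside the valid range,
        # since free-form model output may contain either.
        if i != j and i < n_agents and j < n_agents:
            adj[i][j] = adj[j][i] = 1
    return adj


# Example response text (invented for this sketch); the edge 2-7 is dropped
# because agent 7 does not exist in a three-agent scenario.
prior = parse_edge_prior("Edges: 0-1, 1-2, 2-7", n_agents=3)
```

Validating indices at parse time is one simple way to keep a malformed response from corrupting the graph; a real pipeline might instead re-prompt the model or fall back to a default topology.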

📝 Abstract

Multi-agent reinforcement learning (MARL) is crucial for AI systems that operate collaboratively in distributed and adversarial settings, particularly in multi-domain operations (MDO). A central challenge in cooperative MARL is determining how agents should coordinate: existing approaches hand-specify graph topology, rely on proximity-based heuristics, or learn structure entirely from environment interaction, which makes them brittle, semantically uninformed, or data-intensive, respectively. We investigate whether large language models (LLMs) can generate useful coordination graph priors for MARL by using minimal natural language descriptions of agent observations to infer latent coordination patterns. These priors are integrated into MARL algorithms via graph convolutional layers within a graph neural network (GNN)-based pipeline, and evaluated on four cooperative scenarios from the Multi-Agent Particle Environment (MPE) benchmark against baselines spanning the full spectrum of coordination modeling, from independent learners to state-of-the-art graph-based methods. We further ablate across five compact open-source LLMs to assess the sensitivity of prior quality to model choice. Our results provide the first quantitative evidence that LLM-derived graph priors can enhance coordination and adaptability in dynamic multi-agent environments, and demonstrate that models as small as 1.5B parameters are sufficient for effective prior generation.
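The abstract says the priors enter the policy through graph convolutional layers but does not give the exact layer formulation. As an illustration only, the sketch below assumes the standard symmetric-normalized graph convolution, ReLU(D^-1/2 (A + I) D^-1/2 X W), with the prior adjacency A held fixed. The function names, feature values, and weights are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def graph_conv(adj_prior, features, weight):
    """One layer: ReLU(A_norm @ X @ W). Each agent mixes in its neighbours' features."""
    hidden = matmul(matmul(normalize_adjacency(adj_prior), features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical three-agent chain prior: agent 1 coordinates with agents 0 and 2.
adj_prior = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # one row of observations per agent
weight = [[1.0, 0.0], [0.0, 1.0]]                 # identity, to expose the mixing step
out = graph_conv(adj_prior, features, weight)
```

With the identity weight, each output row is simply the normalized average over an agent's neighbourhood, so agent 0 receives information from agent 1 but nothing directly from agent 2. In a trained model `weight` would be learned, and the prior could also be used as an initialization or mask rather than a fixed graph; which of these the paper does is not stated here.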

Problem

Research questions and friction points this paper is trying to address.

multi-agent reinforcement learning

coordination graphs

graph priors

large language models

multi-agent coordination

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-derived graph priors

multi-agent reinforcement learning

coordination graphs