cMALC-D: Contextual Multi-Agent LLM-Guided Curriculum Learning with Diversity-Based Context Blending

📅 2025-08-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing context-aware multi-agent reinforcement learning (cMARL) methods generalize poorly in dynamic, complex environments, largely because their curricula rely on noisy and unstable proxy signals from the agents, such as value or advantage estimates. To address this, we propose an LLM-guided curriculum learning framework that uses large language models to generate semantically rich, structurally controllable curricula over environmental contexts. We further introduce a diversity-driven context blending mechanism that mitigates mode collapse and improves exploration and robustness. The framework integrates cMARL, prompt engineering, and curriculum learning without requiring additional supervision. Evaluated on traffic signal control, a canonical benchmark for multi-agent coordination, the approach achieves significant improvements in cross-scenario generalization and sample efficiency. These empirical results validate semantic guidance as a basis for curriculum design that improves policy generalization in multi-agent settings.

📝 Abstract
Many multi-agent reinforcement learning (MARL) algorithms are trained in fixed simulation environments, making them brittle when deployed in real-world scenarios with more complex and uncertain conditions. Contextual MARL (cMARL) addresses this by parameterizing environments with context variables and training a context-agnostic policy that performs well across all environment configurations. Existing cMARL methods attempt to use curriculum learning to help train and evaluate context-agnostic policies, but they often rely on unreliable proxy signals, such as value estimates or generalized advantage estimates that are noisy and unstable in multi-agent settings due to inter-agent dynamics and partial observability. To address these issues, we propose Contextual Multi-Agent LLM-Guided Curriculum Learning with Diversity-Based Context Blending (cMALC-D), a framework that uses Large Language Models (LLMs) to generate semantically meaningful curricula and provide a more robust evaluation signal. To prevent mode collapse and encourage exploration, we introduce a novel diversity-based context blending mechanism that creates new training scenarios by combining features from prior contexts. Experiments in traffic signal control domains demonstrate that cMALC-D significantly improves both generalization and sample efficiency compared to existing curriculum learning baselines. We provide code at https://github.com/DaRL-LibSignal/cMALC-D.
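The curriculum idea described in the abstract (an LLM proposing environment contexts, with a context-agnostic policy trained across them) can be sketched roughly as follows. This is an illustrative sketch, not the paper's actual interface: `llm_propose_context`, `train_on_context`, and the traffic-context keys (`arrival_rate`, `turn_ratio`, `num_lanes`) are hypothetical stand-ins, and the LLM call is replaced by a simple difficulty heuristic.

```python
import random

def llm_propose_context(history):
    """Stand-in for an LLM call that proposes the next training
    context given recent performance. Here we simply raise the
    difficulty (arrival rate) when recent returns look strong."""
    difficulty = 0.2 + 0.1 * sum(ret > 0.5 for _, ret in history[-5:])
    return {
        "arrival_rate": round(min(1.0, difficulty), 2),
        "turn_ratio": round(random.uniform(0.1, 0.5), 2),
        "num_lanes": random.choice([2, 3, 4]),
    }

def train_on_context(context):
    """Stub for a MARL training round in one environment
    configuration; returns a mock episode return."""
    return 1.0 - 0.5 * context["arrival_rate"]

def curriculum_loop(steps=10, seed=0):
    """Alternate between LLM-proposed contexts and training,
    keeping the (context, return) history as the feedback signal."""
    random.seed(seed)
    history = []
    for _ in range(steps):
        ctx = llm_propose_context(history)
        history.append((ctx, train_on_context(ctx)))
    return history
```

The key structural point is the feedback loop: the proposal step conditions on training history, which is what lets a semantic generator replace noisy value-estimate proxies as the curriculum signal.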
Problem

Research questions and friction points this paper is trying to address.

Addresses brittleness of multi-agent reinforcement learning in uncertain real-world conditions
Overcomes unreliable proxy signals in contextual multi-agent curriculum learning
Prevents mode collapse and encourages exploration with diversity-based context blending
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided curriculum learning for multi-agent reinforcement learning
Diversity-based context blending mechanism
Semantically meaningful curricula generation
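The diversity-based context blending listed above could plausibly work like this minimal sketch: select a mutually distant pair of previously seen contexts and mix their features to form a new training scenario. `context_distance`, `blend_contexts`, and the L1-distance / most-distant-pair heuristic are assumptions for illustration, not the paper's exact mechanism.

```python
import random

def context_distance(a, b):
    """L1 distance over shared numeric context features."""
    return sum(abs(a[k] - b[k]) for k in a)

def blend_contexts(pool, rng=random):
    """Pick the most mutually distant pair of prior contexts and
    randomly mix their features into a new context, so blended
    scenarios keep pulling training away from a single mode."""
    parent_a, parent_b = max(
        ((a, b) for i, a in enumerate(pool) for b in pool[i + 1:]),
        key=lambda pair: context_distance(*pair),
    )
    return {k: rng.choice([parent_a[k], parent_b[k]]) for k in parent_a}
```

Choosing the most distant parents is one way to operationalize "diversity-based": offspring contexts inherit features from dissimilar configurations, which counteracts the mode collapse the paper targets.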