🤖 AI Summary
This work addresses the challenge of enabling multi-agent systems to coordinate in sub-second real time while continuing to adapt across episodes under a strict online token budget in highly cooperative tasks. The authors propose CoWork-X, an active co-evolution framework that formulates collaboration as a cross-episode closed-loop optimization problem. Inspired by fast-slow memory separation, the framework pairs a Skill-Agent, which executes through hierarchical task network (HTN)-based retrieval from a structured, interpretable, and compositional skill library, with a post-episode Co-Optimizer that consolidates skills in patch style under explicit budget constraints and drift regularization. Experiments on Overcooked-AI-like real-time collaboration benchmarks show stable, cumulative performance gains together with steadily decreasing online latency and token usage.
📝 Abstract
Large language models are enabling language-conditioned agents in interactive environments, but highly cooperative tasks often impose two simultaneous constraints: sub-second real-time coordination and sustained multi-episode adaptation under a strict online token budget. Existing approaches either rely on frequent in-episode reasoning that induces latency and timing jitter, or deliver post-episode improvements through unstructured text that is difficult to compile into reliable low-cost execution. We propose CoWork-X, an active co-evolution framework that casts peer collaboration as a closed-loop optimization problem across episodes, inspired by fast-slow memory separation. CoWork-X instantiates a Skill-Agent that executes via hierarchical task network (HTN)-based skill retrieval from a structured, interpretable, and compositional skill library, and a post-episode Co-Optimizer that performs patch-style skill consolidation with explicit budget constraints and drift regularization. Experiments on challenging Overcooked-AI-like real-time collaboration benchmarks demonstrate that CoWork-X achieves stable, cumulative performance gains while steadily reducing online latency and token usage.
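To make the fast/slow split described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy skill library in which each skill maps a task name to ordered sub-steps, HTN-style retrieval as recursive expansion with no model call, and a post-episode consolidation step that accepts whole-skill patches. All names here (`Skill`, `SkillLibrary`, `drift`, `consolidate`, `max_patches`, `max_drift`) are hypothetical. The patch cap is only a stand-in for the paper's budget constraint, and the step-difference count is only a stand-in for its drift regularization.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill record: a task name and its ordered sub-steps."""
    task: str
    steps: tuple[str, ...]


@dataclass
class SkillLibrary:
    skills: dict[str, Skill] = field(default_factory=dict)

    def expand(self, task: str) -> list[str]:
        """HTN-style retrieval: recursively expand a task into primitives.

        A step with no library entry is treated as a primitive action.
        Assumes the library contains no cyclic decompositions.
        """
        skill = self.skills.get(task)
        if skill is None:
            return [task]
        actions: list[str] = []
        for step in skill.steps:
            actions.extend(self.expand(step))
        return actions


def drift(old: Skill | None, new: Skill) -> int:
    """Toy drift measure (assumption): steps added or removed by a patch."""
    old_steps = set(old.steps) if old else set()
    return len(old_steps ^ set(new.steps))


def consolidate(
    library: SkillLibrary,
    patches: list[Skill],
    max_patches: int,
    max_drift: int,
) -> SkillLibrary:
    """Post-episode step: apply patches under a patch cap and a drift limit."""
    updated = dict(library.skills)
    accepted = 0
    for patch in patches:
        if accepted >= max_patches:
            break  # budget stand-in exhausted for this episode
        if drift(library.skills.get(patch.task), patch) > max_drift:
            continue  # patch changes the existing skill too much; skip it
        updated[patch.task] = patch
        accepted += 1
    return SkillLibrary(updated)


# Fast path: plan retrieval during an episode is a pure lookup.
library = SkillLibrary({
    "serve_soup": Skill("serve_soup", ("cook_soup", "deliver")),
    "cook_soup": Skill("cook_soup", ("fetch_onion", "put_in_pot")),
})
plan_before = library.expand("serve_soup")

# Slow path: after the episode, proposed patches are filtered and merged.
patches = [
    Skill("cook_soup", ("fetch_onion", "put_in_pot", "wait")),  # small edit
    Skill("serve_soup", ("a", "b", "c", "d")),                  # large rewrite
]
library_after = consolidate(library, patches, max_patches=2, max_drift=1)
plan_after = library_after.expand("serve_soup")
```

In this sketch the small edit to `cook_soup` is accepted and the wholesale rewrite of `serve_soup` is rejected by the drift limit, so the next episode's plan changes incrementally rather than being replaced outright.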