🤖 AI Summary
This work addresses the scalability challenge of individual continual learning and collective co-evolution in decentralized, asynchronous, goal-free, and bandwidth-constrained multi-agent reinforcement learning (MARL). We propose the first controller-free modular knowledge-sharing framework: it leverages Wasserstein embeddings to measure policy similarity and select knowledge for dynamic composition; introduces a neural-mask-driven asynchronous policy-ensemble mechanism; and induces an emergent curriculum from easy to hard tasks. Evaluated on multiple standard RL benchmarks, our method significantly improves sample efficiency, in some cases solving tasks that are only achievable through collaboration, so that individual policy performance and emergent collective capabilities improve concurrently. To our knowledge, this is the first approach to achieve robust, scalable cooperative learning dynamics in fully decentralized settings, without global coordination, explicit reward shaping, or centralized training.
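The mask-driven composition mentioned above can be illustrated with a minimal sketch: each task-specific policy is represented as a binary mask over a shared, frozen backbone weight matrix, and a composed policy mixes several masked views. All names, shapes, and the linear mixing rule here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared frozen backbone layer; per-task knowledge lives only in the masks.
backbone = rng.standard_normal((4, 4))
mask_a = (rng.random((4, 4)) > 0.5).astype(float)  # mask learned on task A
mask_b = (rng.random((4, 4)) > 0.5).astype(float)  # mask learned on task B

def compose(masks, weights):
    """Linearly combine masked views of the backbone into one policy layer."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize mixing coefficients
    return sum(w * (m * backbone) for w, m in zip(weights, masks))

layer = compose([mask_a, mask_b], [0.7, 0.3])
print(layer.shape)  # (4, 4)
```

Because only the lightweight masks (not the backbone) would need to be exchanged, this style of factorization is one way a bandwidth-constrained agent could share and reuse policies.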
📝 Abstract
Agentic AI has gained significant interest as a research paradigm focused on autonomy, self-directed learning, and long-term reliability of decision making. Real-world agentic systems operate in decentralized settings on a large set of tasks or data distributions with constraints such as limited bandwidth, asynchronous execution, and the absence of a centralized model or even common objectives. We posit that exploiting previously learned skills, task similarities, and communication capabilities in a collective of agentic AI is challenging but essential to enabling scalability, open-endedness, and beneficial collaborative learning dynamics. In this paper, we introduce Modular Sharing and Composition in Collective Learning (MOSAIC), an agentic algorithm that allows multiple agents to independently solve different tasks while also identifying, sharing, and reusing useful machine-learned knowledge, without coordination, synchronization, or centralized control. MOSAIC combines three mechanisms: (1) modular policy composition via neural network masks, (2) cosine similarity estimation using Wasserstein embeddings for knowledge selection, and (3) asynchronous communication and policy integration. Results on a set of RL benchmarks show that MOSAIC has greater sample efficiency than isolated learners, i.e., it learns significantly faster, and in some cases finds solutions to tasks that cannot be solved by isolated learners. The collaborative learning and sharing dynamics are also observed to result in the emergence of ideal curricula of tasks, from easy to hard. These findings support the case for collaborative learning in agentic systems to achieve better and continuously evolving performance at both the individual and collective levels.
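Mechanism (2), cosine similarity over Wasserstein embeddings, can be sketched for the 1-D case, where the quantile function evaluated on a fixed grid is an exact Wasserstein-2 embedding (L2 distance between embeddings equals the W2 distance between distributions). The embedding dimension, sample counts, and the direct use of raw return samples are assumptions for illustration only, not the paper's construction.

```python
import numpy as np

def wasserstein_embedding(samples, n_quantiles=16):
    # For 1-D empirical distributions, the quantile function on a fixed grid
    # is a Wasserstein-2 embedding into Euclidean space.
    qs = np.linspace(0.0, 1.0, n_quantiles)
    return np.quantile(np.asarray(samples, dtype=float), qs)

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(1)
# Three hypothetical task distributions: two nearly identical, one far away.
e1 = wasserstein_embedding(rng.normal(0.0, 1.0, 500))
e2 = wasserstein_embedding(rng.normal(0.1, 1.0, 500))
e3 = wasserstein_embedding(rng.normal(5.0, 1.0, 500))

# Similar tasks score higher, so an agent would request knowledge from the
# peer whose embedding is most aligned with its own.
print(cosine_similarity(e1, e2) > cosine_similarity(e1, e3))  # True
```

An agent could broadcast only its low-dimensional embedding and compute similarities locally, which keeps knowledge selection compatible with the bandwidth and decentralization constraints described above.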