🤖 AI Summary
This work addresses the lack of a unified benchmark in existing research on model collaboration, which hinders systematic comparison of diverse collaborative strategies. To bridge this gap, we propose MoCo, a modular Python library that, for the first time, systematically integrates 26 collaboration methods, supporting multi-granularity information exchange at the levels of routing, text, logits, and parameters. MoCo incorporates 25 diverse evaluation datasets and provides an extensible framework for heterogeneous model collaboration alongside efficient analysis tools, helping establish model collaboration as a distinct research paradigm. Experimental results show that collaborative strategies outperform single-model baselines in 61.0% of (model, dataset) configurations, with the best-performing method achieving gains of up to 25.8%, highlighting the substantial advantages of collaboration on complex tasks.
📝 Abstract
Advancing beyond single monolithic language models (LMs), recent research increasingly recognizes the importance of model collaboration, where multiple LMs collaborate, compose, and complement each other. Existing research on this topic has mostly been disparate and disconnected, scattered across different research communities and lacking rigorous comparison. To consolidate existing research and establish model collaboration as a school of thought, we present MoCo: a one-stop Python library for executing, benchmarking, and comparing model collaboration algorithms at scale. MoCo features 26 model collaboration methods, spanning diverse levels of cross-model information exchange such as routing, text, logits, and model parameters. MoCo integrates 25 evaluation datasets spanning reasoning, QA, code, safety, and more, and users can flexibly bring their own data. Extensive experiments with MoCo demonstrate that collaboration strategies outperform models without collaboration in 61.0% of (model, data) settings on average, with the most effective methods outperforming by up to 25.8%. We further analyze the scaling behavior of model collaboration strategies and the training/inference efficiency of diverse methods, show that collaborative systems solve problems where single LMs struggle, and discuss future work in model collaboration, all made possible by MoCo. We envision MoCo as a valuable toolkit to facilitate and turbocharge the quest for an open, modular, decentralized, and collaborative AI future.
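To make the idea of logit-level information exchange concrete (one of the exchange levels the abstract mentions, alongside routing, text, and parameters), here is a minimal, generic sketch of ensembling two models' next-token predictions by averaging their output distributions. This is an illustration of the general technique only: the function names, weights, and toy logits below are assumptions for exposition, not MoCo's actual interface.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logit_ensemble(logits_a, logits_b, weight=0.5):
    # Combine two models' next-token logits by a weighted average of
    # their probability distributions, then pick the most likely token.
    # (Product-of-experts and other fusion rules are common alternatives.)
    probs = weight * softmax(logits_a) + (1 - weight) * softmax(logits_b)
    return int(np.argmax(probs))

# Toy logits over a 4-token vocabulary from two hypothetical models.
model_a = np.array([2.0, 1.0, 0.1, 0.0])   # model A prefers token 0
model_b = np.array([0.0, 1.0, 2.5, 0.0])   # model B prefers token 2

print(logit_ensemble(model_a, model_b))    # the fused distribution decides
```

In a real collaborative system the two logit vectors would come from different LMs sharing a tokenizer (or mapped onto a common vocabulary), and the fusion would run at every decoding step; the sketch only shows the single-step arithmetic.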