A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

📅 2025-04-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing data synthesis and distillation methods rely heavily on large language models (LLMs), which incur high computational costs, poor environmental sustainability, and heightened bias risks. A single small language model (SLM), conversely, struggles to ensure generation quality, diversity, and reliability at the same time. To address this, we propose GRA, a role-based collaborative framework of multiple SLMs in which Generator, Reviewer, and Adjudicator roles emulate peer review through prompt-driven role assignment, iterative critique-and-revision, conflict arbitration, and multi-round quality feedback loops, enabling distributed, specialized data synthesis. Experiments show that GRA-produced data matches or exceeds the quality of data synthesized by Qwen-2.5-72B-Instruct across multiple benchmarks, marking the first instance of an SLM ensemble achieving data-level performance parity with a 72B-parameter LLM. All code, models, and datasets are publicly released.

📝 Abstract

While data synthesis and distillation are promising strategies to enhance small language models, current approaches rely heavily on Large Language Models (LLMs), which suffer from high computational costs, environmental inefficiency, and potential biases inherited from monolithic architectures. In contrast, smaller LLMs are more accessible and sustainable, but their individual capabilities often fall short in generating high-quality, diverse, and reliable data. Inspired by collaborative human processes (e.g., peer review), we propose GRA, a framework involving multiple small LLMs that aggregates specialized roles across them to achieve the iterative refinement and quality control typically provided by a single large LLM. In this collaborative framework, multiple small LLMs assume distinct roles (Generator, Reviewer, and Adjudicator) to simulate a peer-review-inspired data synthesis pipeline. The Generator proposes initial data samples, the Reviewer critiques their quality and diversity, and the Adjudicator resolves conflicts to finalize the output. By decomposing the synthesis process into specialized sub-tasks, collaborative small LLMs can achieve data-level parity with large LLM-based distillation. Through experiments across multiple benchmarks, we demonstrate that GRA-produced data matches or exceeds the quality of single large LLM outputs, e.g., those of Qwen-2.5-72B-Instruct. Our results challenge the necessity of monolithic large models for high-quality data synthesis, advocating instead for strategic coordination of smaller agents. Our datasets, models, and code are publicly available at https://github.com/GX-XinGao/GRA.
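
The Generator, Reviewer, and Adjudicator flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the role signatures, the `Verdict` record, the score range, and the `accept_at` and `disagree_at` thresholds are all assumptions made for illustration, and each role is a plain callable standing in for a prompted small LLM. The sketch covers a single round only. It assumes that "conflict" means reviewers disagreeing strongly on a score, which the abstract does not specify.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List

# Hypothetical role interfaces; in practice each would wrap a prompted small LLM.
Generator = Callable[[str], str]                  # seed topic -> candidate sample
Reviewer = Callable[[str], float]                 # candidate -> score in [0, 1]
Adjudicator = Callable[[str, List[float]], bool]  # candidate + scores -> accept?


@dataclass
class Verdict:
    sample: str
    scores: List[float]
    accepted: bool
    adjudicated: bool  # True if the reviewers disagreed and the Adjudicator decided


def synthesize_one(
    seed: str,
    generator: Generator,
    reviewers: List[Reviewer],
    adjudicator: Adjudicator,
    accept_at: float = 0.7,    # assumed acceptance threshold on the mean score
    disagree_at: float = 0.3,  # assumed score spread that counts as a conflict
) -> Verdict:
    """Run one generate -> review -> (maybe) adjudicate pass for a single seed."""
    if not reviewers:
        raise ValueError("at least one reviewer is required")
    sample = generator(seed)
    scores = [review(sample) for review in reviewers]
    if max(scores) - min(scores) > disagree_at:
        # Reviewers conflict: hand the final decision to the Adjudicator.
        return Verdict(sample, scores, adjudicator(sample, scores), True)
    # Reviewers roughly agree: accept or reject on their mean score.
    return Verdict(sample, scores, mean(scores) >= accept_at, False)


# Toy stand-ins for the three roles (fixed outputs instead of model calls).
def toy_generator(seed: str) -> str:
    return f"Explain {seed} in one sentence."


def toy_adjudicator(sample: str, scores: List[float]) -> bool:
    return mean(scores) >= 0.5


agreeing = synthesize_one(
    "gradient descent", toy_generator,
    [lambda s: 0.8, lambda s: 0.9], toy_adjudicator,
)
conflicting = synthesize_one(
    "gradient descent", toy_generator,
    [lambda s: 0.2, lambda s: 0.9], toy_adjudicator,
)
```

In this toy run, `agreeing` is accepted directly on the reviewers' mean score, while `conflicting` is routed to the adjudicator because the two scores differ by more than `disagree_at`. A multi-round variant would feed rejected samples and reviewer feedback back to the generator, which this sketch leaves out.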

Problem

Research questions and friction points this paper is trying to address.

A single small LLM falls short on data synthesis quality when used alone

High costs and biases of large LLMs in synthesis

Coordinating small LLMs to match large LLM performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

A collaborative framework of small LLMs replaces a single large LLM for data synthesis

Specialized roles: Generator, Reviewer, Adjudicator

Iterative refinement reaches the data quality of a large LLM
