ProST: Progressive Sub-task Training for Pareto-Optimal Multi-agent Systems Using Small Language Models

📅 2025-09-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Small language models (SLMs) suffer from weak sub-task generalization and limited performance in multi-agent systems due to difficulties in learning long-horizon trajectories and modeling long-range dependencies. Method: We propose a progressive sub-task training strategy that integrates analogy-based instance-level curriculum learning with role specialization and Pareto-optimal analysis, optimizing SLM collaboration hierarchically: from simple to complex, and from local sub-tasks to global coordination. Contribution/Results: Our approach systematically alleviates the long-range dependency bottleneck while maintaining low computational overhead. Experiments across diverse configurations demonstrate consistent reductions in sub-task error rates, achieving superior Pareto-optimal trade-offs between efficiency and effectiveness. The method outperforms standard fine-tuning and end-to-end joint training baselines, establishing a scalable framework for enhancing multi-agent cooperation with SLMs.

📝 Abstract
Multi-agent systems built with smaller language models (SLMs) present a viable alternative to single-agent systems powered by large language models (LLMs) for addressing complex problems. In this work, we study how these alternatives compare in terms of both effectiveness and efficiency. To study this trade-off, we instantiate single- and multi-agent systems for the complex problems in the AppWorld environment using language models of different sizes. We find that difficulties with long-trajectory learning in SLMs limit their performance. Even when trained for specialized roles, SLMs fail to learn all sub-tasks effectively. To address this issue, we introduce a simple progressive sub-task training strategy, which introduces new sub-tasks progressively in each training epoch. We find that this novel strategy, analogous to instance-level curriculum learning, consistently improves the effectiveness of multi-agent systems across all configurations. Our Pareto analysis shows that fine-tuned multi-agent systems yield better effectiveness-efficiency trade-offs. Additional ablations and analyses show the importance of our progressive training strategy and its ability to reduce sub-task error rates.
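The paper does not include code on this page; the following is a minimal sketch of what a progressive sub-task schedule of the kind the abstract describes could look like, where each training epoch introduces one additional sub-task from a list assumed to be ordered from simple to complex. The sub-task names and function are illustrative assumptions, not the authors' implementation.

```python
def progressive_schedule(subtasks, num_epochs):
    """Yield, for each epoch, the sub-tasks active so far.

    `subtasks` is assumed to be ordered from simple to complex;
    epoch 0 trains on one sub-task, and each subsequent epoch adds
    one more until the full set has been introduced.
    """
    for epoch in range(num_epochs):
        # Cap at the full set once every sub-task has been introduced.
        active = subtasks[: min(epoch + 1, len(subtasks))]
        yield epoch, active


# Illustrative sub-task names (hypothetical, not from the paper):
tasks = ["parse_request", "plan_steps", "call_api", "verify_result"]
for epoch, active in progressive_schedule(tasks, num_epochs=5):
    print(epoch, active)
```

Training data for each epoch would then be drawn only from the active sub-tasks, so early epochs focus on the simplest skills before the agent sees the full trajectory.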
Problem

Research questions and friction points this paper is trying to address.

Addressing long-trajectory learning difficulties in small language models
Improving multi-agent system effectiveness through progressive sub-task training
Achieving better efficiency-effectiveness trade-offs with fine-tuned multi-agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive sub-task training strategy that introduces new sub-tasks in each training epoch
Instance-level curriculum learning combined with role specialization for SLM agents
Pareto analysis showing improved effectiveness-efficiency trade-offs for fine-tuned multi-agent systems