🤖 AI Summary
This study addresses the limitations of small language models (<10B parameters), notably their constrained knowledge and reasoning capabilities, which hinder effective real-world deployment. Prior work has largely overlooked the potential of agent-based paradigms to compensate for these shortcomings. This paper presents the first systematic evaluation of open-source small models across three settings: base models, single-agent (tool-augmented) systems, and multi-agent collaborative frameworks, analyzing the performance–cost trade-off in each. Experimental results show that the single-agent paradigm substantially improves task effectiveness while remaining efficient. In contrast, multi-agent collaboration yields only marginal gains at significantly higher computational overhead. The findings support lightweight agent architectures as a viable strategy for deploying small models and suggest an agent-centric design pathway for resource-constrained scenarios.
📝 Abstract
Despite the impressive capabilities of large language models, their substantial computational costs, latency, and privacy risks hinder their widespread deployment in real-world applications. Small Language Models (SLMs) with fewer than 10 billion parameters present a promising alternative; however, their inherent limitations in knowledge and reasoning curtail their effectiveness. Existing research primarily focuses on enhancing SLMs through scaling laws or fine-tuning strategies while overlooking the potential of agent paradigms, such as tool use and multi-agent collaboration, to systematically compensate for the inherent weaknesses of small models. To address this gap, this paper presents the first large-scale, comprehensive study of open-source models with fewer than 10B parameters under three paradigms: (1) the base model, (2) a single agent equipped with tools, and (3) a multi-agent system with collaborative capabilities. Our results show that single-agent systems achieve the best balance between performance and cost, while multi-agent setups add overhead with limited gains. Our findings highlight the importance of agent-centric design for efficient and trustworthy deployment in resource-constrained settings.