🤖 AI Summary
Prior research on LLM tool use has been predominantly confined to English, leaving a critical gap in multilingual settings, particularly for low-resource languages like Arabic. Method: This work presents the first systematic investigation of tool-use capabilities in Arabic LLMs. To address data scarcity, we introduce the first open-source Arabic tool-use dataset and propose a three-stage optimization framework: (1) cross-lingual transfer initialization, (2) general instruction tuning to enhance zero-shot generalization, and (3) targeted fine-tuning on high-priority tools (e.g., calculator, date parser) to strengthen domain-specific proficiency. Contribution/Results: Experiments demonstrate that localized Arabic data boosts tool-call accuracy by 28.6%, that general instruction tuning yields consistent gains, and that targeted fine-tuning delivers further substantial improvements. Our dataset and methodology establish foundational resources for developing practical Arabic AI agents.
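To make the task concrete: in tool calling, the model emits a structured call (commonly JSON) naming a tool and its arguments, which the host application parses and executes. The sketch below is a minimal illustration of that loop for an Arabic query, not the paper's actual interface; the JSON schema, the `TOOLS` registry, and the `dispatch` helper are all hypothetical.

```python
import json

# Illustrative tool registry. "calculator" is one of the high-priority
# tools named in the summary; the restricted eval stands in for a real
# arithmetic backend.
TOOLS = {
    "calculator": lambda expression: str(eval(expression, {"__builtins__": {}})),
}

# Hypothetical model output for the Arabic query
# "كم يساوي 12 في 7؟" ("What is 12 times 7?").
model_output = '{"tool": "calculator", "arguments": {"expression": "12 * 7"}}'

def dispatch(raw: str) -> str:
    """Parse a JSON tool call emitted by the model and run the named tool."""
    call = json.loads(raw)
    tool = TOOLS[call["tool"]]
    return tool(**call["arguments"])

print(dispatch(model_output))  # → 84
```

The paper's tool-call accuracy metric can be read against this loop: a call counts as correct only if the emitted JSON names the right tool and supplies arguments the dispatcher can execute.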
📝 Abstract
Tool calling is a critical capability that allows Large Language Models (LLMs) to interact with external systems, significantly expanding their utility. However, research and resources for tool calling are predominantly English-centric, leaving a gap in our understanding of how to enable this functionality for other languages, such as Arabic. This paper investigates three key research questions: (1) the necessity of in-language (Arabic) tool-calling data versus relying on cross-lingual transfer, (2) the effect of general-purpose instruction tuning on tool-calling performance, and (3) the value of fine-tuning on specific, high-priority tools. To address these questions, we conduct extensive experiments using base and post-trained variants of an open-weight Arabic LLM. To enable this study, we bridge the resource gap by translating and adapting two open-source tool-calling datasets into Arabic. Our findings provide crucial insights into the optimal strategies for developing robust tool-augmented agents for Arabic.