🤖 AI Summary
Problem: Large language models (LLMs) exhibit low accuracy and poor robustness when invoking multiple external tools across multi-turn human–AI interactions. Method: This paper proposes Magnet, a novel framework featuring (i) graph-based translation for automatic trajectory synthesis—iteratively mapping function signature paths to query-call sequences—and (ii) a dual-stage training strategy combining contrastive prompt distillation (using positive/negative examples) with multi-turn direct preference optimization (mDPO) to enhance generalization over complex tool chains. Technically, Magnet integrates graph modeling, context-aware distillation, supervised fine-tuning (SFT), and mDPO. Contribution/Results: On the BFCL-v3 and ToolQuery benchmarks, Magnet-14B-mDPO achieves scores of 68.01 and 73.30, respectively—significantly outperforming Gemini-1.5-pro-002. These results validate the efficacy of graph-structured trajectory generation and multi-turn preference alignment for robust, scalable tool orchestration.
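The graph-based signature-path idea can be sketched as follows. This is a toy illustration, not the paper's implementation: the signatures, type names, and helper functions below are all hypothetical. It builds a directed graph in which an edge f → g exists when f's output type can feed one of g's inputs, then samples a signature path that could later be translated into a query–call trajectory.

```python
import random

# Hypothetical function signatures: name -> (input types, output type).
SIGNATURES = {
    "search_flights": (["city", "city", "date"], "flight_list"),
    "book_flight": (["flight_list"], "booking_id"),
    "send_email": (["booking_id", "email"], "status"),
    "get_weather": (["city", "date"], "forecast"),
}

def build_graph(sigs):
    """Add edge f -> g when f's output type appears among g's input types."""
    graph = {name: [] for name in sigs}
    for f, (_, out_type) in sigs.items():
        for g, (in_types, _) in sigs.items():
            if f != g and out_type in in_types:
                graph[f].append(g)
    return graph

def sample_signature_path(graph, start, max_len=3, rng=random):
    """Random-walk a chain of compatible function signatures."""
    path = [start]
    while len(path) < max_len and graph[path[-1]]:
        path.append(rng.choice(graph[path[-1]]))
    return path

graph = build_graph(SIGNATURES)
path = sample_signature_path(graph, "search_flights")
# With these toy signatures the only chain from "search_flights" is
# search_flights -> book_flight -> send_email.
```

In the paper, such paths are then iteratively translated (by a teacher model) into natural-language queries paired with executable function calls; this sketch covers only the path-construction step.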
📝 Abstract
Large language models (LLMs) have exhibited the ability to effectively utilize external tools to address user queries. However, their performance may be limited in complex, multi-turn interactions involving users and multiple tools. To address this, we propose Magnet, a principled framework for synthesizing high-quality training trajectories to enhance the function-calling capability of large language model agents in multi-turn conversations with humans. The framework is based on automatic and iterative translation from a function signature path to a sequence of queries and executable function calls. We model the complicated function interactions in multi-turn cases with a graph and design novel node operations to build reliable signature paths. Motivated by context distillation, when guiding the generation of positive and negative trajectories with a teacher model, we provide reference function call sequences as positive hints in context and contrastive, incorrect function calls as negative hints. Experiments show that, trained on the positive trajectories with supervised fine-tuning and against the negative trajectories with preference optimization, our 14B model, Magnet-14B-mDPO, obtains 68.01 on BFCL-v3 and 73.30 on ToolQuery, surpassing the teacher model Gemini-1.5-pro-002 by a large margin in function calling.
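The preference-optimization step against negative trajectories builds on direct preference optimization (DPO); the paper's mDPO extends this to multi-turn trajectories. As a rough sketch of the underlying objective only (the exact mDPO formulation is not given here), the standard per-pair DPO loss on sequence log-probabilities is:

```python
import math

def dpo_loss(logp_pos, logp_neg, ref_logp_pos, ref_logp_neg, beta=0.1):
    """Standard DPO loss for one preference pair (sketch).

    logp_* are sequence log-probs under the policy being trained;
    ref_logp_* are the same quantities under a frozen reference model.
    The loss pushes the policy to widen the positive-vs-negative margin
    relative to the reference: -log(sigmoid(beta * margin)).
    """
    margin = (logp_pos - ref_logp_pos) - (logp_neg - ref_logp_neg)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With equal log-probs the margin is zero and the loss is log(2) ~ 0.693;
# a positive margin (preferred trajectory gaining probability) lowers it.
```

In a multi-turn setting, the log-probabilities would be accumulated over the function-call turns of a whole trajectory pair rather than a single response.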