🤖 AI Summary
Existing simulation-based multi-turn agent data generation methods rely on autoregressive, multi-LLM interactions, resulting in high computational cost and poor scalability. Method: We propose ToolACE-MT, a non-autoregressive iterative framework that constructs high-quality multi-turn dialogue trajectories via a three-stage paradigm: (1) coarse-grained initialization, (2) mask-filling-based multi-turn refinement, and (3) offline joint validation using both rule-based heuristics and model-based scoring. Crucially, ToolACE-MT abandons turn-by-turn generation, instead producing full dialogues in one pass and iteratively optimizing them to explicitly model complex tool invocation patterns and dynamic user-agent interactions. Contribution/Results: Experiments demonstrate substantial improvements in generation efficiency—up to 3.2× faster than autoregressive baselines—while the synthesized data significantly enhances the generalization capability and deployment robustness of tool-augmented large language models across diverse real-world scenarios.
📝 Abstract
Agentic task-solving with Large Language Models (LLMs) requires multi-turn, multi-step interactions, often involving complex function calls and dynamic user-agent exchanges. Existing simulation-based data generation methods for such scenarios rely heavily on costly autoregressive interactions between multiple LLM agents, thereby limiting real-world performance of agentic tasks. In this paper, we propose a novel Non-Autoregressive Iterative Generation framework, called ToolACE-MT, for constructing high-quality multi-turn agentic dialogues. ToolACE-MT generates full conversational trajectories through three stages: coarse-grained initialization, iterative refinement, and offline verification. The initialization phase builds a structurally complete yet semantically coarse dialogue skeleton; the iterative refinement phase introduces realistic complexities and continued refinement via mask-and-fill operations; and the offline verification phase ensures correctness and coherence via rule- and model-based checks. Experiments demonstrate that ToolACE-MT enables efficient, effective and generalizable agentic data generation, offering a new paradigm for high-quality data construction in tool-augmented LLM scenarios.