🤖 AI Summary
Modeling cross-tool dependencies and enabling dynamic planning across multi-turn dialogues constitute a core challenge in agent reasoning. Methodologically, we introduce T1, the first tool-dependency-aware dialogue dataset, covering nine single- and multi-domain task compositions and systematically characterizing explicit inter-API dependencies. Our approach integrates a dual-timescale caching mechanism with a dynamic replanning strategy—supporting adaptive decisions to reuse or recompute prior steps—alongside tool-augmented dialogue construction, multi-domain task orchestration, and cache-aware inference. Experiments demonstrate that T1-Agent, trained on T1, significantly improves completion rates and reasoning consistency on multi-step dependent tasks. T1 also establishes a new open-source benchmark for rigorously evaluating the planning capabilities of large language models.
📝 Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities as intelligent agents capable of solving complex problems. However, effective planning in scenarios involving dependencies between API or tool calls—particularly in multi-turn conversations—remains a significant challenge. To address this, we introduce T1, a tool-augmented, multi-domain, multi-turn conversational dataset specifically designed to capture and manage inter-tool dependencies across diverse domains. T1 enables rigorous evaluation of agents' ability to coordinate tool use across nine distinct domains (four single-domain and five multi-domain) with the help of an integrated caching mechanism for both short- and long-term memory, while supporting dynamic replanning—such as deciding whether to recompute or reuse cached results. Beyond facilitating research on tool use and planning, T1 also serves as a benchmark for evaluating the performance of open-source language models. We present results powered by T1-Agent, highlighting its ability to plan and reason in complex, tool-dependent scenarios.
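The abstract describes a caching mechanism spanning short- and long-term memory, with replanning decisions about whether to reuse or recompute a tool result. The paper does not specify an implementation, so the following is only a minimal illustrative sketch under our own assumptions: a turn-scoped short-term cache and a dialogue-scoped long-term cache keyed by tool call, with a `volatile` flag (hypothetical) standing in for the replanning decision that a result is stale and must be recomputed.

```python
import time


class ToolCache:
    """Illustrative dual-timescale cache for tool-call results.

    All names and TTL values here are assumptions for the sketch,
    not the T1 paper's actual API: a short-term store models
    within-turn memory, a long-term store models cross-turn memory.
    """

    def __init__(self, short_ttl=60.0, long_ttl=3600.0):
        self.short = {}  # (tool, args) -> (value, timestamp)
        self.long = {}
        self.short_ttl = short_ttl
        self.long_ttl = long_ttl

    def _fresh(self, entry, ttl):
        return entry is not None and time.time() - entry[1] < ttl

    def lookup(self, tool, args):
        # Prefer the short-term cache, then fall back to long-term.
        key = (tool, args)
        for store, ttl in ((self.short, self.short_ttl),
                           (self.long, self.long_ttl)):
            entry = store.get(key)
            if self._fresh(entry, ttl):
                return entry[0]
        return None

    def call(self, tool, args, fn, volatile=False):
        """Reuse a cached result when available; otherwise recompute.

        `volatile` marks tools whose output changes quickly (e.g. a
        flight-availability lookup), forcing a recompute: a stand-in
        for the agent's reuse-vs-recompute replanning decision.
        """
        if not volatile:
            cached = self.lookup(tool, args)
            if cached is not None:
                return cached
        value = fn(*args)
        now = time.time()
        self.short[(tool, args)] = (value, now)
        self.long[(tool, args)] = (value, now)
        return value
```

In this sketch, a dependent step reuses an upstream tool's cached output unless the agent flags that tool as volatile, in which case the step recomputes and both caches are refreshed.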