ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration

📅 2026-01-11

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses inefficiencies in tool-integrated reasoning (TIR) with large language models, which often stem from redundant or insufficient tool calls and a lack of explicit behavioral calibration. To tackle this, the authors propose ET-Agent, a training framework that calibrates the agent at the level of behavioral patterns. ET-Agent uses a self-evolving data flywheel to generate enhanced reasoning trajectories and a two-phase behavior calibration training strategy to refine the agent’s reasoning paths. The authors report improvements in correctness, efficiency, reasoning conciseness, and tool execution accuracy.

📝 Abstract

Large Language Models (LLMs) can extend beyond the limits of their parametric knowledge by adopting the Tool-Integrated Reasoning (TIR) paradigm. However, existing LLM-based agent training frameworks often focus on answer accuracy and overlook explicit alignment of behavior patterns. Consequently, agents often exhibit ineffective actions during TIR tasks, such as redundant or insufficient tool calls. How to calibrate erroneous behavioral patterns when executing TIR tasks, and thereby explore effective trajectories, remains an open problem. In this paper, we propose ET-Agent, a training framework for calibrating an agent's tool-use behavior from two synergistic perspectives: a Self-evolving Data Flywheel and Behavior Calibration Training. Specifically, we introduce a self-evolving data flywheel to generate enhanced data, which is used to fine-tune the LLM and improve its exploration ability. Building on this, we implement a two-phase behavior-calibration training framework designed to progressively calibrate erroneous behavioral patterns toward optimal behaviors. Further in-depth experiments confirm the superiority of ET-Agent across multiple dimensions, including correctness, efficiency, reasoning conciseness, and tool execution accuracy. Our ET-Agent framework provides practical insights for research in the TIR field. Code is available at https://github.com/asilverlight/ET-Agent

Problem

Research questions and friction points this paper is trying to address.

Tool-Integrated Reasoning

Behavior Calibration

Large Language Models

Agent Training

Tool-use Behavior

Innovation

Methods, ideas, or system contributions that make the work stand out.

Tool-Integrated Reasoning

Behavior Calibration

Self-evolving Data Flywheel