Tools as Continuous Flow for Evolving Agentic Reasoning

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limitations of existing large language models in tool-augmented reasoning, which typically adopt a step-by-step paradigm lacking global planning capabilities, leading to error accumulation in long-horizon tasks and restricted generalization. The paper introduces a novel approach that formulates chain-of-tool-use as a continuous trajectory generation problem in semantic space, leveraging conditional flow matching to enable coherent and robust agent reasoning. The authors establish the first closed-loop evaluation benchmark tailored for plan-level reasoning and provide theoretical guarantees that their method ensures utility convergence, error decay, and generalization robustness. Experimental results demonstrate that the proposed method significantly outperforms current stepwise baselines on long-horizon tasks.

📝 Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in orchestrating tools for reasoning tasks. However, existing methods rely on a step-wise paradigm that lacks a global perspective, which causes error accumulation over long horizons and restricts generalization to unseen tools. To overcome these limitations, we propose Tools as Continuous Flow for Evolving Agentic Reasoning (FlowAgent), which reconceptualizes tool chaining as continuous trajectory generation within a semantic space. Specifically, FlowAgent leverages conditional flow matching to generate continuous latent trajectories, providing a global planning perspective that supports coherent and robust tool execution. To systematically evaluate this paradigm, we introduce the first closed-loop benchmark dedicated to plan-level agentic reasoning in dynamic real-world environments. Theoretically, we establish formal bounds on utility convergence and prove that our continuous formulation guarantees robust generalization and error attenuation. Empirical evaluations show that FlowAgent achieves superior robustness and adaptability in long-horizon reasoning tasks.
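
To make "conditional flow matching" concrete, here is a minimal sketch of the general technique, not FlowAgent's actual implementation, which this page does not describe. It assumes the common straight-line probability path between a start point x0 and a target point x1, where the regression target for the velocity field is simply x1 - x0. The function names, the 2-D points, and the `oracle` field are all hypothetical and exist only for illustration.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight-line path from x0 to x1 at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight-line path: constant and equal to x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def cfm_loss(velocity_fn, x0, x1, t):
    """Squared error between predicted and target velocity at one sampled time."""
    x_t = interpolate(x0, x1, t)
    pred = velocity_fn(x_t, t)
    target = target_velocity(x0, x1)
    return sum((p - u) ** 2 for p, u in zip(pred, target))

def euler_trajectory(velocity_fn, x0, steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    traj = [list(x)]
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        traj.append(list(x))
    return traj

# Hypothetical 2-D stand-ins for a start point and a target plan embedding.
x0 = [0.0, 0.0]
x1 = [1.0, 2.0]

# An oracle field that already equals the target velocity (assumption for the
# demo); in practice a learned network would be trained to minimize cfm_loss.
def oracle(x, t):
    return target_velocity(x0, x1)

loss = cfm_loss(oracle, x0, x1, t=random.random())
trajectory = euler_trajectory(oracle, x0, steps=4)
```

With the oracle field the loss is zero and the integrated trajectory moves from x0 to x1 along a straight line. A trained model would replace `oracle` with a network conditioned on the task, and the resulting trajectory would be decoded into tool calls by a mechanism the abstract does not specify.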

Problem

Research questions and friction points this paper is trying to address.

tool chaining

long-horizon reasoning

error accumulation

generalization

agentic reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

continuous flow

tool chaining

agentic reasoning