MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks

📅 2026-02-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current large language models exhibit limited performance on complex tasks requiring interaction with external tools and dynamic environments. Prevailing agent frameworks often suffer from rigid workflows, poor stability, narrow task coverage, and reliance on costly commercial APIs. To address these limitations, this work proposes MiroFlow, a high-performance, fully open-source agent framework that introduces a flexibly composable agent graph structure, an optional deep reasoning mode, and a robust workflow execution mechanism. These innovations significantly enhance the system’s autonomy, robustness, and cross-task generalization in complex scenarios. Built upon open-source large language models and integrating diverse tool-calling interfaces without dependence on commercial APIs, the framework achieves state-of-the-art results across multiple authoritative benchmarks, including GAIA, BrowseComp-EN/ZH, HLE, xBench-DeepSearch, and FutureX, and establishes a reproducible and easily comparable open-source baseline for agent research.

📝 Abstract

Despite the remarkable progress of large language models (LLMs), the capabilities of standalone LLMs have begun to plateau when tackling real-world, complex tasks that require interaction with external tools and dynamic environments. Although recent agent frameworks aim to enhance model autonomy through tool integration and external interaction, they still suffer from naive workflows, unstable performance, limited support across diverse benchmarks and tasks, and heavy reliance on costly commercial APIs. In this work, we propose a high-performance and robust open-source agent framework, termed MiroFlow, which incorporates an agent graph for flexible orchestration, an optional deep reasoning mode to enhance performance, and a robust workflow execution mechanism to ensure stable and reproducible performance. Extensive experiments demonstrate that MiroFlow consistently achieves state-of-the-art performance across multiple agent benchmarks, including GAIA, BrowseComp-EN/ZH, HLE, xBench-DeepSearch, and notably FutureX. We hope it can serve as an easily accessible, reproducible, and comparable baseline for the deep research community.

Problem

Research questions and friction points this paper is trying to address.

agent framework

large language models

tool integration

workflow stability

benchmark diversity

Innovation

Methods, ideas, or system contributions that make the work stand out.

agent framework

agent graph

deep reasoning