FlowReasoner: Reinforcing Query-Level Meta-Agents

📅 2025-04-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the challenge of dynamically constructing a customized multi-agent system for each user query, this paper proposes FlowReasoner, a query-level meta-agent. FlowReasoner generates a dedicated multi-agent workflow for each individual query through deliberative reasoning, and it is trained with reinforcement learning (RL) driven by external execution feedback. Its contributions are threefold: (1) a query-level meta-agent that designs one system per user query; (2) a multi-purpose reward that guides RL training along three aspects: performance, complexity, and efficiency; and (3) initialization of the meta-agent by distilling DeepSeek R1, which supplies the basic reasoning ability needed to generate multi-agent systems. Evaluated on engineering and competition code benchmarks, FlowReasoner surpasses o1-mini by 10.52% accuracy across three benchmarks.

📝 Abstract
This paper proposes a query-level meta-agent named FlowReasoner to automate the design of query-level multi-agent systems, i.e., one system per user query. Our core idea is to incentivize a reasoning-based meta-agent via external execution feedback. Concretely, by distilling DeepSeek R1, we first endow FlowReasoner with the basic reasoning ability needed to generate multi-agent systems. Then, we further enhance it via reinforcement learning (RL) with external execution feedback. A multi-purpose reward is designed to guide the RL training from the aspects of performance, complexity, and efficiency. In this manner, FlowReasoner is able to generate a personalized multi-agent system for each user query via deliberative reasoning. Experiments on both engineering and competition code benchmarks demonstrate the superiority of FlowReasoner. Remarkably, it surpasses o1-mini by 10.52% accuracy across three benchmarks. The code is available at https://github.com/sail-sg/FlowReasoner.
Problem

Research questions and friction points this paper is trying to address.

Automate design of query-level multi-agent systems
Enhance meta-agent via reinforcement learning feedback
Generate personalized multi-agent systems per query
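
To make "one system per user query" concrete, the following minimal sketch shows a stand-in meta-agent that assembles a different workflow for each query. It is an illustration only: `toy_meta_agent`, `solve`, the step names, and the keyword and length heuristics are hypothetical and are not taken from the FlowReasoner code. The actual meta-agent is a trained reasoning model, not a rule-based function.

```python
from typing import Callable, List

# A workflow maps a query to a result. Here the "result" is simply a
# description of the steps the workflow would run.
Workflow = Callable[[str], str]


def toy_meta_agent(query: str) -> Workflow:
    """Hypothetical stand-in for a meta-agent: choose workflow steps per query."""
    steps: List[str] = ["generate"]
    if "test" in query.lower():
        # Assumption: queries that mention tests get a test-running step.
        steps.append("run_tests")
    if len(query.split()) > 8:
        # Assumption: longer queries get a planning step first.
        steps.insert(0, "plan")

    def workflow(q: str) -> str:
        return " -> ".join(steps)

    return workflow


def solve(query: str, meta_agent: Callable[[str], Workflow] = toy_meta_agent) -> str:
    """Query-level design: a fresh workflow is built for every query."""
    workflow = meta_agent(query)
    return workflow(query)
```

Calling `solve` with a short query and with a longer, test-focused query yields two different workflows, which is the property that separates a query-level design from a single workflow shared across a whole task.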
Innovation

Methods, ideas, or system contributions that make the work stand out.

Query-level meta-agent automates multi-agent design
Reinforcement learning enhances reasoning with feedback
Multi-purpose reward optimizes performance, complexity, efficiency
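
One way to picture a reward that combines performance, complexity, and efficiency is a weighted sum with penalties. The sketch below is an assumption for illustration: the summary does not give the reward formula, so the field names, weights, and normalization caps (`max_calls`, `max_time_s`) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Hypothetical record of one workflow execution; field names are assumptions."""
    pass_rate: float       # performance: fraction of tests passed, in [0, 1]
    num_agent_calls: int   # complexity proxy: number of agent invocations
    wall_time_s: float     # efficiency proxy: execution time in seconds


def multi_purpose_reward(
    fb: ExecutionFeedback,
    w_perf: float = 1.0,
    w_complexity: float = 0.1,
    w_efficiency: float = 0.1,
    max_calls: int = 10,
    max_time_s: float = 60.0,
) -> float:
    """Reward performance; penalize complex and slow workflows (illustrative weights)."""
    # Normalize each penalty to [0, 1] so the weights are comparable.
    complexity_penalty = min(fb.num_agent_calls / max_calls, 1.0)
    efficiency_penalty = min(fb.wall_time_s / max_time_s, 1.0)
    return (
        w_perf * fb.pass_rate
        - w_complexity * complexity_penalty
        - w_efficiency * efficiency_penalty
    )
```

With these illustrative weights, two workflows that pass the same fraction of tests are ranked by how simple and fast they are, so the training signal favors the leaner design without trading away correctness.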