FlowReasoner: Reinforcing Query-Level Meta-Agents

📅 2025-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the challenge of constructing a customized multi-agent system for each user query, this paper proposes FlowReasoner, a query-level meta-agent. FlowReasoner generates a dedicated multi-agent system for a single query through deliberative reasoning. It is trained in two stages: it first acquires basic reasoning ability for generating multi-agent systems by distilling DeepSeek R1, and is then further enhanced via reinforcement learning (RL) with external execution feedback. A multi-purpose reward guides the RL training with respect to performance, complexity, and efficiency. Experiments on engineering and competition code benchmarks demonstrate its superiority; it surpasses o1-mini by 10.52% accuracy across three benchmarks.

📝 Abstract

This paper proposes a query-level meta-agent named FlowReasoner to automate the design of query-level multi-agent systems, i.e., one system per user query. Our core idea is to incentivize a reasoning-based meta-agent via external execution feedback. Concretely, by distilling DeepSeek R1, we first endow FlowReasoner with the basic reasoning ability needed to generate multi-agent systems. Then, we further enhance it via reinforcement learning (RL) with external execution feedback. A multi-purpose reward is designed to guide the RL training from the aspects of performance, complexity, and efficiency. In this manner, FlowReasoner is enabled to generate a personalized multi-agent system for each user query via deliberative reasoning. Experiments on both engineering and competition code benchmarks demonstrate the superiority of FlowReasoner. Remarkably, it surpasses o1-mini by 10.52% accuracy across three benchmarks. The code is available at https://github.com/sail-sg/FlowReasoner.

Problem

Research questions and friction points this paper is trying to address.

Automate design of query-level multi-agent systems

Enhance the meta-agent via reinforcement learning with external execution feedback

Generate personalized multi-agent systems per query

Innovation

Methods, ideas, or system contributions that make the work stand out.

Query-level meta-agent automates multi-agent design

Reinforcement learning with external execution feedback enhances the meta-agent's reasoning

Multi-purpose reward optimizes performance, complexity, efficiency
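
To make the multi-purpose reward idea concrete, here is a minimal Python sketch of how performance, complexity, and efficiency signals from executing one generated workflow might be folded into a single scalar reward. This is not the paper's reward function: the `ExecutionFeedback` fields, the proxies chosen for complexity and efficiency, the weights, and the normalization caps are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Hypothetical summary of executing one generated workflow on its tests."""
    tests_passed: int
    tests_total: int
    num_agent_calls: int   # assumed proxy for workflow complexity
    wall_time_s: float     # assumed proxy for (in)efficiency


def multi_purpose_reward(
    fb: ExecutionFeedback,
    w_perf: float = 1.0,        # assumed weight, not taken from the paper
    w_complexity: float = 0.1,  # assumed weight, not taken from the paper
    w_efficiency: float = 0.1,  # assumed weight, not taken from the paper
    max_calls: int = 20,        # assumed cap used to normalize complexity
    max_time_s: float = 60.0,   # assumed cap used to normalize run time
) -> float:
    """Combine the three aspects into one scalar: reward passing tests,
    penalize workflows that use many agent calls or run slowly."""
    if fb.tests_total <= 0:
        raise ValueError("tests_total must be positive")
    performance = fb.tests_passed / fb.tests_total
    # Clip both penalties to [0, 1] so no single term dominates the reward.
    complexity = min(fb.num_agent_calls / max_calls, 1.0)
    slowness = min(fb.wall_time_s / max_time_s, 1.0)
    return w_perf * performance - w_complexity * complexity - w_efficiency * slowness


if __name__ == "__main__":
    simple = ExecutionFeedback(tests_passed=10, tests_total=10,
                               num_agent_calls=2, wall_time_s=6.0)
    heavy = ExecutionFeedback(tests_passed=10, tests_total=10,
                              num_agent_calls=20, wall_time_s=60.0)
    print(multi_purpose_reward(simple), multi_purpose_reward(heavy))
```

Under these assumed weights, two workflows with the same pass rate are ranked by how lean and fast they are, which is the trade-off a reward of this kind is meant to encode.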

🔎 Similar Papers

FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering