🤖 AI Summary
This work addresses the limited generalization of current large language models in diverse planning tasks, primarily caused by the scarcity of high-quality interaction data and by gradient conflicts during multi-task training. To overcome these challenges, we propose MagicAgent, a foundational planning model built on a lightweight and scalable synthetic trajectory generation framework. This framework integrates hierarchical task decomposition, tool augmentation, and multi-constraint scheduling to produce synthetic data spanning a broad spectrum of planning scenarios. A two-stage training paradigm—supervised fine-tuning followed by multi-objective reinforcement learning—effectively mitigates inter-task interference and substantially enhances cross-task generalization. Experimental results demonstrate that MagicAgent-32B and MagicAgent-30B-A3B significantly outperform existing open- and closed-source models on benchmarks such as Worfbench and NaturalPlan, achieving a peak accuracy of 86.9%.
📝 Abstract
The evolution of Large Language Models (LLMs) from passive text processors to autonomous agents has established planning as a core component of modern intelligence. However, achieving generalized planning remains elusive, hindered not only by the scarcity of high-quality interaction data but also by inherent conflicts across heterogeneous planning tasks. These challenges yield models that excel at isolated tasks yet struggle to generalize, while existing multi-task training attempts suffer from gradient interference. In this paper, we present \textbf{MagicAgent}, a series of foundation models specifically designed for generalized agent planning. We introduce a lightweight and scalable synthetic data framework that generates high-quality trajectories across diverse planning tasks, including hierarchical task decomposition, tool-augmented planning, multi-constraint scheduling, procedural logic orchestration, and long-horizon tool execution. To mitigate training conflicts, we propose a two-stage training paradigm comprising supervised fine-tuning followed by multi-objective reinforcement learning over both static datasets and dynamic environments. Empirical results demonstrate that MagicAgent-32B and MagicAgent-30B-A3B deliver superior performance, achieving accuracies of $75.1\%$ on Worfbench, $55.9\%$ on NaturalPlan, $57.5\%$ on $\tau^2$-Bench, $86.9\%$ on BFCL-v3, and $81.2\%$ on ACEBench, as well as strong results on our in-house MagicEval benchmarks. These results substantially outperform existing sub-100B models and even surpass leading closed-source models.