Learning to Evolve: A Self-Improving Framework for Multi-Agent Systems via Textual Parameter Graph Optimization

📅 2026-04-22

🤖 AI Summary

This work addresses the heavy reliance on manual design in multi-agent systems and the limitations of existing automated methods, which lack structural awareness and do not improve their own optimization strategies over time. The paper proposes the Textual Parameter Graph Optimization (TPGO) framework, which models the system as an optimizable textual parameter graph. TPGO generates structured natural language feedback, termed “textual gradients,” from execution trajectories and integrates a Group Relative Agent Optimization (GRAO) strategy that learns from historical optimization experience, so the system can evolve itself. The framework thus combines structural awareness and continual learning in a single self-optimization mechanism. The authors report that, on the complex benchmarks GAIA and MCP-Universe, TPGO significantly raises the success rate of state-of-the-art agent frameworks.

📝 Abstract

Designing and optimizing multi-agent systems (MAS) is a complex, labor-intensive process of "Agent Engineering." Existing automatic optimization methods, primarily focused on flat prompt tuning, lack the structural awareness to debug the intricate web of interactions in MAS. More critically, these optimizers are static; they do not learn from experience to improve their own optimization strategies. To address these gaps, we introduce Textual Parameter Graph Optimization (TPGO), a framework that enables a multi-agent system to learn to evolve. TPGO first models the MAS as a Textual Parameter Graph (TPG), where agents, tools, and workflows are modular, optimizable nodes. To guide evolution, we derive "textual gradients," structured natural language feedback from execution traces, to pinpoint failures and suggest granular modifications. The core of our framework is Group Relative Agent Optimization (GRAO), a novel meta-learning strategy that learns from historical optimization experiences. By analyzing past successes and failures, GRAO becomes progressively better at proposing effective updates, allowing the system to learn how to optimize itself. Extensive experiments on complex benchmarks like GAIA and MCP-Universe show that TPGO significantly enhances the performance of state-of-the-art agent frameworks, achieving higher success rates through automated, self-improving optimization.
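
The pieces named in the abstract can be made concrete with a small sketch. It is an illustration only, not the authors' implementation: the class names, fields, and the `apply` method are hypothetical, and the abstract does not specify how GRAO compares candidate updates, so the mean-centered scoring in `group_relative_scores` is an assumption drawn from the name alone.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Node:
    """One optimizable textual parameter (hypothetical schema)."""
    name: str
    kind: str   # "agent", "tool", or "workflow"
    text: str   # the prompt, tool description, or workflow spec


@dataclass
class TextualGradient:
    """Structured feedback aimed at a single node (hypothetical schema)."""
    target: str         # name of the node to modify
    diagnosis: str      # what went wrong in the execution trace
    proposed_text: str  # suggested replacement for the node's text


@dataclass
class TextualParameterGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[tuple[str, str]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def connect(self, src: str, dst: str) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must already be in the graph")
        self.edges.append((src, dst))

    def apply(self, grad: TextualGradient) -> "TextualParameterGraph":
        """Return a new graph in which only the targeted node is rewritten."""
        if grad.target not in self.nodes:
            raise KeyError(f"unknown node: {grad.target}")
        new_nodes = dict(self.nodes)
        old = new_nodes[grad.target]
        new_nodes[grad.target] = Node(old.name, old.kind, grad.proposed_text)
        return TextualParameterGraph(new_nodes, list(self.edges))


def group_relative_scores(scores: list[float]) -> list[float]:
    """Center each candidate's score on the group mean.

    Assumption: this is one plausible reading of "group relative";
    the source does not describe the actual scoring rule.
    """
    baseline = mean(scores)
    return [s - baseline for s in scores]


# Build a two-node system and apply one granular update to it.
graph = TextualParameterGraph()
graph.add_node(Node("planner", "agent", "Break the task into steps."))
graph.add_node(Node("search", "tool", "Search the web for a query."))
graph.connect("planner", "search")

grad = TextualGradient(
    target="planner",
    diagnosis="The plan never said which tool each step should use.",
    proposed_text="Break the task into steps and name the tool for each step.",
)
updated = graph.apply(grad)

# Compare three hypothetical candidate updates by success score.
relative = group_relative_scores([1.0, 2.0, 3.0])
```

Returning a new graph from `apply` rather than mutating in place keeps the earlier version available, which is one simple way a history of optimization attempts could be retained for later comparison.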

Problem

Research questions and friction points this paper is trying to address.

multi-agent systems

automatic optimization

structural awareness

self-improving

agent engineering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Textual Parameter Graph

Textual Gradients

Group Relative Agent Optimization