Polymath: A Self-Optimizing Agent with Dynamic Hierarchical Workflow

📅 2025-08-04

🤖 AI Summary
Problem: Building intelligent agents for real-world scenarios is hindered by scarce labeled data and by tasks whose structure changes dynamically. Method: We propose a supervision-free, self-optimizing general-purpose agent centered on a dynamic hierarchical workflow architecture. It combines code-based workflow modeling, a task flow graph representation, self-reflection, and an evolutionary algorithm coupled with multi-grid-inspired heuristic graph optimization, so that workflow structures evolve autonomously. Contribution/Results: The approach removes the reliance on annotated data while keeping behavior task-adaptive and continuously optimizing the hierarchical process. Evaluated on six benchmarks covering programming, mathematical reasoning, and multi-turn question answering, it achieves an average 8.1% improvement over state-of-the-art methods, with gains in both solution efficiency and generalization on complex reasoning tasks.

📝 Abstract
Large language models (LLMs) excel at solving complex tasks by executing agentic workflows composed of detailed instructions and structured operations. Yet building general-purpose agents by manually embedding foundation models into agentic systems such as Chain-of-Thought, Self-Reflection, and ReAct through text interfaces limits scalability and efficiency. Recently, many researchers have sought to automate the generation and optimization of these workflows through code-based representations. However, existing methods often rely on labeled datasets to train and optimize workflows, which makes them ineffective and inflexible for real-world, dynamic problems where labeled data is unavailable. To address this challenge, we introduce Polymath, a self-optimizing agent with a dynamic hierarchical workflow that leverages the flexibility of task flow graphs and the expressiveness of code-represented workflows to solve a wide range of real-world, dynamic problems. The proposed optimization methodology integrates multi-grid-inspired graph optimization with a self-reflection-guided evolutionary algorithm to refine workflows without labeled data. Experimental results on six benchmark datasets across coding, math, and multi-turn QA tasks show that Polymath achieves an 8.1% average improvement over state-of-the-art baselines.
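
To make the pairing of task flow graphs with code-represented workflows more concrete, here is a minimal sketch in Python. It is not Polymath's implementation: the `TaskFlowGraph` class, the node names, and the lambda bodies (which stand in for LLM calls) are all hypothetical. It only illustrates the general idea that each node is ordinary code and the graph edges decide execution order.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Sequence

# A node is a plain function: it receives the outputs of its dependencies
# (keyed by node name) and returns its own output.
NodeFn = Callable[[Dict[str, str]], str]


class TaskFlowGraph:
    """Hypothetical container: named code nodes plus dependency edges."""

    def __init__(self) -> None:
        self.nodes: Dict[str, NodeFn] = {}
        self.deps: Dict[str, List[str]] = {}

    def add(self, name: str, fn: NodeFn, deps: Sequence[str] = ()) -> None:
        self.nodes[name] = fn
        self.deps[name] = list(deps)

    def run(self) -> Dict[str, str]:
        """Execute every node after all of its dependencies have run."""
        results: Dict[str, str] = {}
        for name in TopologicalSorter(self.deps).static_order():
            inputs = {d: results[d] for d in self.deps[name]}
            results[name] = self.nodes[name](inputs)
        return results


def build_example() -> TaskFlowGraph:
    # The lambdas are placeholders for model calls (assumption for illustration).
    g = TaskFlowGraph()
    g.add("plan", lambda _: "split into: parse, solve")
    g.add("parse", lambda inp: f"parsed after [{inp['plan']}]", deps=["plan"])
    g.add("solve", lambda inp: f"solved using [{inp['parse']}]", deps=["parse"])
    return g
```

Calling `build_example().run()` returns a dictionary of per-node outputs. Because the workflow is ordinary code plus a dependency map, an optimizer can change it by adding, removing, or rewiring nodes rather than by editing a prompt.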

Problem

Research questions and friction points this paper is trying to address.

Automating agentic workflow generation without labeled data
Enhancing scalability in general-purpose LLM-based agents
Optimizing dynamic hierarchical workflows for real-world problems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic hierarchical workflow for self-optimizing agent
Code-represented workflows with task flow graphs
Multi-grid-inspired graph optimization combined with a self-reflection-guided evolutionary algorithm
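
The label-free refinement idea in the last item can be illustrated with a small Python sketch. This is a plain (1+λ)-style evolutionary hill climb, not the paper's algorithm, and the multi-grid component is not modeled at all. The step names in `STEP_POOL`, the `mutate` operator, and `toy_self_score` are assumptions made for illustration. In a real agent the score would come from the model's own reflection on a candidate workflow rather than from ground-truth labels.

```python
import random
from typing import Callable, Tuple

Workflow = Tuple[str, ...]

# Hypothetical step names; a real system would use executable workflow nodes.
STEP_POOL = ("plan", "draft", "critique", "revise", "verify")


def mutate(wf: Workflow, rng: random.Random) -> Workflow:
    """Return a copy of wf with one step inserted, removed, or replaced."""
    steps = list(wf)
    op = rng.choice(("insert", "remove", "replace"))
    if op == "insert" or not steps:
        steps.insert(rng.randrange(len(steps) + 1), rng.choice(STEP_POOL))
    elif op == "remove" and len(steps) > 1:
        steps.pop(rng.randrange(len(steps)))
    else:
        steps[rng.randrange(len(steps))] = rng.choice(STEP_POOL)
    return tuple(steps)


def evolve(
    seed_wf: Workflow,
    self_score: Callable[[Workflow], float],
    generations: int = 30,
    population: int = 8,
    rng_seed: int = 0,
) -> Tuple[Workflow, float]:
    """Keep the best-scoring workflow seen so far; no labels are consulted."""
    rng = random.Random(rng_seed)
    best, best_score = seed_wf, self_score(seed_wf)
    for _ in range(generations):
        candidates = [mutate(best, rng) for _ in range(population)]
        for cand in candidates:
            score = self_score(cand)
            if score > best_score:
                best, best_score = cand, score
    return best, best_score


def toy_self_score(wf: Workflow) -> float:
    """Stand-in for a self-reflection signal: reward distinct steps, penalize length."""
    return len(set(wf)) - 0.1 * len(wf)
```

A call such as `evolve(("draft",), toy_self_score)` returns the best workflow found and its score. The point of the design is that `self_score` is the only feedback channel, so replacing the toy function with a self-assessment call would leave the loop unchanged.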