Polymath: A Self-Optimizing Agent with Dynamic Hierarchical Workflow

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Building intelligent agents for real-world scenarios is hindered by scarce labeled data and by tasks that are complex and change dynamically. Method: We propose a supervision-free, self-optimizing general-purpose agent built around a dynamic hierarchical workflow architecture. It combines code-based workflow modeling, a task-flow graph representation, self-reflection mechanisms, and an evolutionary algorithm driven by multi-grid heuristic graph optimization to evolve workflow structures autonomously. Contribution/Results: The approach removes the reliance on annotated data and enables task-adaptive behavior with continuous optimization of the hierarchical process. Evaluated on six benchmarks covering programming, mathematical reasoning, and multi-turn question answering, it achieves an average 8.1% improvement over state-of-the-art methods, with gains in both solution efficiency and generalization on complex reasoning tasks.

📝 Abstract
Large language models (LLMs) excel at solving complex tasks by executing agentic workflows composed of detailed instructions and structured operations. Yet building general-purpose agents by manually embedding foundation models into agentic systems such as Chain-of-Thought, Self-Reflection, and ReAct through text interfaces limits scalability and efficiency. Recently, many researchers have sought to automate the generation and optimization of these workflows through code-based representations. However, existing methods often rely on labeled datasets to train and optimize workflows, which makes them ineffective and inflexible for real-world, dynamic problems where labeled data is unavailable. To address this challenge, we introduce Polymath, a self-optimizing agent with a dynamic hierarchical workflow that leverages the flexibility of task flow graphs and the expressiveness of code-represented workflows to solve a wide range of real-world, dynamic problems. The proposed optimization methodology integrates multi-grid-inspired graph optimization with a self-reflection-guided evolutionary algorithm to refine workflows without labeled data. Experimental results on six benchmark datasets across coding, math, and multi-turn QA tasks show that Polymath achieves an 8.1% average improvement over state-of-the-art baselines.
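
To make the idea of refining workflows without labeled data more concrete, here is a minimal sketch in Python. It is not the paper's method: it uses a plain (1+λ)-style evolutionary loop rather than the multi-grid-inspired procedure, and every name in it (`Workflow`, `OPERATORS`, `mutate`, `evolve`, `toy_reflection_score`) is a hypothetical stand-in. The scoring function is a toy heuristic that takes the place of an LLM-driven self-reflection signal; no ground-truth labels are consulted.

```python
import random
from typing import Callable, Tuple

# Hypothetical encoding: a workflow is an ordered tuple of operator names.
Workflow = Tuple[str, ...]
OPERATORS = ["generate", "reflect", "revise", "unit_test"]


def mutate(wf: Workflow, rng: random.Random) -> Workflow:
    """Return a copy with one operator inserted, removed, or replaced."""
    ops = list(wf)
    choice = rng.choice(["insert", "remove", "replace"])
    if choice == "insert" or len(ops) <= 1:
        # Very short workflows are only ever grown, so they never become empty.
        ops.insert(rng.randrange(len(ops) + 1), rng.choice(OPERATORS))
    elif choice == "remove":
        ops.pop(rng.randrange(len(ops)))
    else:
        ops[rng.randrange(len(ops))] = rng.choice(OPERATORS)
    return tuple(ops)


def toy_reflection_score(wf: Workflow) -> float:
    """Label-free stand-in for a self-reflection critique (assumption)."""
    score = 0.0
    if "generate" in wf:
        score += 1.0
        # Reward a reflection step that comes after the first generation step.
        if "reflect" in wf and wf.index("generate") < wf.index("reflect"):
            score += 1.0
    if "unit_test" in wf:
        score += 0.5
    return score - 0.1 * len(wf)  # mild penalty on workflow length


def evolve(seed_wf: Workflow,
           score: Callable[[Workflow], float],
           generations: int = 20,
           population: int = 6,
           rng_seed: int = 0) -> Tuple[Workflow, float]:
    """Keep the best workflow seen so far; mutate it each generation."""
    rng = random.Random(rng_seed)
    best, best_score = seed_wf, score(seed_wf)
    for _ in range(generations):
        candidates = [mutate(best, rng) for _ in range(population)]
        for cand in candidates:
            s = score(cand)
            if s > best_score:
                best, best_score = cand, s
    return best, best_score


seed = ("generate",)
best_wf, best_score = evolve(seed, toy_reflection_score)
print(best_wf, round(best_score, 2))
```

Because the loop only accepts a candidate that scores strictly higher, the returned score can never fall below that of the seed workflow. In a real system the scoring call would be the expensive part, since it would involve executing the workflow and asking a model to critique the result.
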

Problem

Research questions and friction points this paper is trying to address.

Automating agentic workflow generation without labeled data
Enhancing scalability in general-purpose LLM-based agents
Optimizing dynamic hierarchical workflows for real-world problems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic hierarchical workflow for self-optimizing agent
Code-represented workflows with task flow graphs
Multi-grid graph optimization and evolutionary algorithm
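
As an illustration of how code-represented workflows can sit on top of a task flow graph, the following sketch models each subtask as a node whose `run` callable is its workflow, and executes the nodes in dependency order. The structure and names (`TaskNode`, `execute_graph`, and the `plan`/`solve`/`review` functions) are assumptions made for this example, not the paper's implementation; the node functions are plain Python stand-ins for LLM calls.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List


@dataclass
class TaskNode:
    """One subtask in a task flow graph (hypothetical structure)."""
    name: str
    run: Callable[[Dict[str, str]], str]  # code-represented workflow for this subtask
    deps: List[str] = field(default_factory=list)


def execute_graph(nodes: List[TaskNode]) -> Dict[str, str]:
    """Run each node after all of its dependencies and collect the outputs."""
    by_name = {n.name: n for n in nodes}
    # TopologicalSorter expects a mapping from node to its predecessors.
    order = TopologicalSorter({n.name: set(n.deps) for n in nodes}).static_order()
    results: Dict[str, str] = {}
    for name in order:
        node = by_name[name]
        inputs = {dep: results[dep] for dep in node.deps}
        results[name] = node.run(inputs)
    return results


# Stand-ins for LLM-backed steps (assumptions for illustration only).
def plan(_inputs: Dict[str, str]) -> str:
    return "plan: parse, then sum"


def solve(inputs: Dict[str, str]) -> str:
    return f"solution based on [{inputs['plan']}]"


def review(inputs: Dict[str, str]) -> str:
    return f"reviewed: {inputs['solve']}"


# Nodes are listed out of order on purpose; the graph decides execution order.
graph = [
    TaskNode("review", review, deps=["solve"]),
    TaskNode("plan", plan),
    TaskNode("solve", solve, deps=["plan"]),
]
outputs = execute_graph(graph)
print(outputs["review"])
```

Keeping the graph (which subtasks exist and how they depend on each other) separate from the per-node code makes it possible to change either level independently, which is one plausible reading of "hierarchical" here.
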
Chia-Tung Ho
Nvidia, Santa Clara, CA, USA
Jing Gong
Nvidia, Santa Clara, CA, USA
Xufeng Yao
The Chinese University of Hong Kong
Large Language Model, Computer Vision, Machine Learning in EDA
Yunsheng Bai
University of California, Los Angeles
Graph Deep Learning, Representation Learning, Discrete Structures, Machine Learning
Abhishek B Akkur
Nvidia, Santa Clara, CA, USA
Haoxing Ren
Nvidia, Santa Clara, CA, USA