ROMA: Recursive Open Meta-Agent Framework for Long-Horizon Multi-Agent Systems

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing agent frameworks in long-horizon tasks, which are constrained by shallow reasoning depth, limited context windows, and opaque execution. To overcome these challenges, the authors propose a modular four-role architecture—comprising an Atomizer, Planner, Executor, and Aggregator—that recursively decomposes tasks into a dependency-aware subtask tree for parallel execution. Structured aggregation and intermediate result compression are employed to control context growth. The framework further introduces the novel GEPA+ prompting strategy, which decouples orchestration logic from underlying models and enables heterogeneous multi-agent collaboration. Experimental results demonstrate a 9.9% accuracy improvement on SEAL-0 and show that DeepSeek-V3, when enhanced with this approach, matches the performance of Claude Sonnet 4.5 on EQ-Bench, significantly advancing long-horizon reasoning and generation capabilities.
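The four-role loop described above can be sketched as a small recursive program. This is a minimal illustrative sketch only, not the authors' actual API: all function names and the toy decomposition heuristic are assumptions, with stub functions standing in for the LLM and tool calls ROMA would make.

```python
# Hypothetical sketch of ROMA's decompose-execute-aggregate cycle.
# All names and heuristics here are illustrative, not the paper's API.

def atomizer(task: str) -> bool:
    """Decide whether a task is atomic, i.e. executable directly."""
    # Toy heuristic: a task with no " and " conjunction is atomic.
    return len(task.split(" and ")) == 1

def planner(task: str) -> list[str]:
    """Decompose a task into subtasks (dependency-aware in ROMA)."""
    return [t.strip() for t in task.split(" and ")]

def executor(task: str) -> str:
    """Execute an atomic task (stand-in for an LLM or tool call)."""
    return f"result({task})"

def aggregator(task: str, results: list[str]) -> str:
    """Compress and combine subtask results to control context growth."""
    return f"agg[{task}: {' + '.join(results)}]"

def solve(task: str) -> str:
    """Recursive loop: atomize, else plan, solve subtasks, aggregate."""
    if atomizer(task):
        return executor(task)
    subtasks = planner(task)
    results = [solve(sub) for sub in subtasks]  # parallelizable in ROMA
    return aggregator(task, results)

print(solve("gather sources and draft report"))
```

In the real framework each role is a swappable component, which is what lets ROMA mix heterogeneous models per role by cost, latency, and capability.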

📝 Abstract
Current agentic frameworks underperform on long-horizon tasks. As reasoning depth increases, sequential orchestration becomes brittle, context windows impose hard limits that degrade performance, and opaque execution traces make failures difficult to localize or debug. We introduce ROMA (Recursive Open Meta-Agents), a domain-agnostic framework that addresses these limitations through recursive task decomposition and structured aggregation. ROMA decomposes goals into dependency-aware subtask trees that can be executed in parallel, while aggregation compresses and validates intermediate results to control context growth. Our framework standardizes agent construction around four modular roles (Atomizer, which decides whether a task should be decomposed; Planner; Executor; and Aggregator) that cleanly separate orchestration from model selection and enable transparent, hierarchical execution traces. This design supports heterogeneous multi-agent systems that mix models and tools according to cost, latency, and capability. To adapt ROMA to specific tasks without fine-tuning, we further introduce GEPA+, an improved Genetic-Pareto prompt proposer that searches over prompts within ROMA's component hierarchy while preserving interface contracts. We show that ROMA, combined with GEPA+, delivers leading system-level performance on reasoning and long-form generation benchmarks. On SEAL-0, which evaluates reasoning over conflicting web evidence, ROMA instantiated with GLM-4.6 improves accuracy by 9.9% over Kimi-Researcher. On EQ-Bench, a long-form writing benchmark, ROMA enables DeepSeek-V3 to match the performance of leading closed-source models such as Claude Sonnet 4.5. Our results demonstrate that recursive, modular agent architectures can scale reasoning depth while remaining interpretable, flexible, and model-agnostic.
Problem

Research questions and friction points this paper is trying to address.

long-horizon tasks
sequential orchestration
context window limits
execution trace opacity
multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recursive Task Decomposition
Modular Multi-Agent Framework
Context-Aware Aggregation
Heterogeneous Agent Systems
Prompt Optimization
👥 Authors
Salaheddin Alzu'bi
Sentient
Baran Nama
Sentient
Arda Kaz
Sentient, UC Berkeley
Anushri Eswaran
Sentient, UC San Diego
Weiyuan Chen
Virginia Tech
Sarvesh Khetan
Sentient, University of Maryland
Rishab Bala
Virginia Tech
Tu Vu
Research Scientist, Google DeepMind; Assistant Professor, Virginia Tech
Sewoong Oh
Sentient