ROMA: Recursive Open Meta-Agent Framework for Long-Horizon Multi-Agent Systems

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing agent frameworks in long-horizon tasks, which are constrained by shallow reasoning depth, limited context windows, and opaque execution. To overcome these challenges, the authors propose a modular four-role architecture—comprising an Atomizer, Planner, Executor, and Aggregator—that recursively decomposes tasks into a dependency-aware subtask tree for parallel execution. Structured aggregation and intermediate result compression are employed to control context growth. The framework further introduces the novel GEPA+ prompting strategy, which decouples orchestration logic from underlying models and enables heterogeneous multi-agent collaboration. Experimental results demonstrate a 9.9% accuracy improvement on SEAL-0 and show that DeepSeek-V3, when enhanced with this approach, matches the performance of Claude Sonnet 4.5 on EQ-Bench, significantly advancing long-horizon reasoning and generation capabilities.
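The four-role loop described above can be sketched as a small recursive program. This is a minimal illustrative sketch only, not the authors' actual API: all function names and the toy decomposition heuristic are assumptions, with stub functions standing in for the LLM and tool calls ROMA would make.

```python
# Hypothetical sketch of ROMA's decompose-execute-aggregate cycle.
# All names and heuristics here are illustrative, not the paper's API.

def atomizer(task: str) -> bool:
    """Decide whether a task is atomic, i.e. executable directly."""
    # Toy heuristic: a task with no " and " conjunction is atomic.
    return len(task.split(" and ")) == 1

def planner(task: str) -> list[str]:
    """Decompose a task into subtasks (dependency-aware in ROMA)."""
    return [t.strip() for t in task.split(" and ")]

def executor(task: str) -> str:
    """Execute an atomic task (stand-in for an LLM or tool call)."""
    return f"result({task})"

def aggregator(task: str, results: list[str]) -> str:
    """Compress and combine subtask results to control context growth."""
    return f"agg[{task}: {' + '.join(results)}]"

def solve(task: str) -> str:
    """Recursive loop: atomize, else plan, solve subtasks, aggregate."""
    if atomizer(task):
        return executor(task)
    subtasks = planner(task)
    results = [solve(sub) for sub in subtasks]  # parallelizable in ROMA
    return aggregator(task, results)

print(solve("gather sources and draft report"))
```

In the real framework each role is a swappable component, which is what lets ROMA mix heterogeneous models per role by cost, latency, and capability.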

📝 Abstract
Current agentic frameworks underperform on long-horizon tasks. As reasoning depth increases, sequential orchestration becomes brittle, context windows impose hard limits that degrade performance, and opaque execution traces make failures difficult to localize or debug. We introduce ROMA (Recursive Open Meta-Agents), a domain-agnostic framework that addresses these limitations through recursive task decomposition and structured aggregation. ROMA decomposes goals into dependency-aware subtask trees that can be executed in parallel, while aggregation compresses and validates intermediate results to control context growth. Our framework standardizes agent construction around four modular roles (Atomizer, which decides whether a task should be decomposed; Planner; Executor; and Aggregator) that cleanly separate orchestration from model selection and enable transparent, hierarchical execution traces. This design supports heterogeneous multi-agent systems that mix models and tools according to cost, latency, and capability. To adapt ROMA to specific tasks without fine-tuning, we further introduce GEPA+, an improved Genetic-Pareto prompt proposer that searches over prompts within ROMA's component hierarchy while preserving interface contracts. We show that ROMA, combined with GEPA+, delivers leading system-level performance on reasoning and long-form generation benchmarks. On SEAL-0, which evaluates reasoning over conflicting web evidence, ROMA instantiated with GLM-4.6 improves accuracy by 9.9% over Kimi-Researcher. On EQ-Bench, a long-form writing benchmark, ROMA enables DeepSeek-V3 to match the performance of leading closed-source models such as Claude Sonnet 4.5. Our results demonstrate that recursive, modular agent architectures can scale reasoning depth while remaining interpretable, flexible, and model-agnostic.
Problem

Research questions and friction points this paper is trying to address.

long-horizon tasks
sequential orchestration
context window limits
execution trace opacity
multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recursive Task Decomposition
Modular Multi-Agent Framework
Context-Aware Aggregation
Heterogeneous Agent Systems
Prompt Optimization
👥 Authors
Salaheddin Alzu'bi
Sentient
Baran Nama
Sentient
Arda Kaz
Sentient, UC Berkeley
Anushri Eswaran
Sentient, UC San Diego
Weiyuan Chen
Virginia Tech
Sarvesh Khetan
Sentient, University of Maryland
Rishab Bala
Virginia Tech
Tu Vu
Research Scientist, Google DeepMind; Assistant Professor, Virginia Tech
Sewoong Oh
Sentient