HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems

📅 2025-05-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing multi-agent systems (MAS) are constrained by predefined roles and static communication topologies, limiting their adaptability to complex, specialized tasks. This paper proposes HALO, a hierarchical autonomous multi-agent framework: a high-level planner decomposes tasks, a mid-level module dynamically generates task-adaptive roles, and a low-level layer of LLM-based agents executes subtasks collaboratively. Subtask execution is formalized as a structured workflow search problem, solved via Monte Carlo Tree Search (MCTS)-guided exploration of reasoning trajectories; an Adaptive Prompt Refinement module rewrites raw user queries into task-specific prompts, enabling task-driven role evolution and removing the need for manual prompt engineering. Evaluated on the HumanEval, MMLU, and MATH benchmarks, HALO achieves an average 14.4% improvement over state-of-the-art baselines, including gains of up to 13.3% on MMLU Moral Scenarios and up to 19.6% on MATH Algebra, demonstrating strong generalization on expert-level tasks.

📝 Abstract
Recent advancements in Multi-Agent Systems (MAS) powered by Large Language Models (LLMs) have demonstrated tremendous potential in diverse task scenarios. Nonetheless, existing agentic systems typically rely on predefined agent-role design spaces and static communication structures, limiting their adaptability as well as flexibility in complex interaction environments and leading to subpar performance on highly specialized and expert-level tasks. To address these issues, we introduce HALO, a multi-agent collaboration framework based on a hierarchical reasoning architecture. Specifically, we incorporate a high-level planning agent for task decomposition, mid-level role-design agents for subtask-specific agent instantiation, and low-level inference agents for subtask execution. Particularly, subtask execution is reformulated as a structured workflow search problem, where Monte Carlo Tree Search (MCTS) systematically explores the agentic action space to construct optimal reasoning trajectories. Additionally, as the majority of users lack expertise in prompt engineering, we leverage an Adaptive Prompt Refinement module to transform raw queries into task-specific prompts. Empirical evaluations on Code Generation (HumanEval), General Reasoning (MMLU), and Arithmetic Reasoning (MATH) benchmark datasets highlight the effectiveness of HALO, yielding a 14.4% average improvement over state-of-the-art baselines. Notably, HALO achieves up to 13.3% performance gain on the Moral Scenarios subject in the MMLU benchmark and up to 19.6% performance gain on the Algebra subarea in the MATH benchmark, indicating its advanced proficiency in tackling highly specialized and expert-level tasks. The code repository is available at https://github.com/23japhone/HALO.
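The three-tier hierarchy described in the abstract (planner → role designers → inference agents) can be sketched as follows. This is a minimal illustration only: the class names, methods, and naive task splitting are hypothetical and do not come from the HALO codebase, where each step would be an LLM call.

```python
from dataclasses import dataclass

# Hypothetical sketch of HALO's three-tier pipeline:
# a high-level planner decomposes the task, mid-level role
# designers instantiate subtask-specific agents, and low-level
# inference agents execute each subtask.

@dataclass
class Subtask:
    description: str
    role: str = ""
    result: str = ""

class Planner:
    """High-level agent: decomposes a task into subtasks."""
    def decompose(self, task: str) -> list[Subtask]:
        # In HALO this is LLM-driven; here we split naively on ';'.
        return [Subtask(part.strip()) for part in task.split(";")]

class RoleDesigner:
    """Mid-level agent: assigns a task-adaptive role per subtask."""
    def assign_role(self, subtask: Subtask) -> Subtask:
        subtask.role = f"expert in '{subtask.description}'"
        return subtask

class InferenceAgent:
    """Low-level agent: executes a subtask under its assigned role."""
    def execute(self, subtask: Subtask) -> Subtask:
        subtask.result = f"[{subtask.role}] handled: {subtask.description}"
        return subtask

def run_halo_style(task: str) -> list[str]:
    planner, designer, agent = Planner(), RoleDesigner(), InferenceAgent()
    subtasks = [designer.assign_role(s) for s in planner.decompose(task)]
    return [agent.execute(s).result for s in subtasks]
```

The key design point is that roles are not fixed in advance: the mid-level tier creates them per subtask, which is what the paper contrasts against predefined agent-role design spaces.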
Problem

Research questions and friction points this paper is trying to address.

Predefined agent roles and static communication topologies limit MAS adaptability
Existing agentic systems underperform on highly specialized, expert-level tasks
Effective prompt engineering is out of reach for most non-expert users
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical reasoning architecture for multi-agent collaboration
Monte Carlo Tree Search for optimal reasoning trajectories
Adaptive Prompt Refinement for task-specific prompts
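The MCTS-guided workflow search in the second bullet can be illustrated with a generic UCT loop. This is a simplified sketch under stated assumptions: the toy action space, reward function, and horizon stand in for HALO's agentic action space and are not taken from the paper.

```python
import math
import random

# Generic UCT-style Monte Carlo Tree Search over a small action
# space, standing in for HALO's search over workflow steps.
# A "state" here is a tuple of actions taken so far.

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children, self.visits, self.value = [], 0, 0.0

def uct_score(node, c=1.4):
    if node.visits == 0:
        return float("inf")  # always try unvisited children first
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root_state, actions, reward, step, iters=200, horizon=3):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # Selection: descend to a leaf by UCT.
        while node.children:
            node = max(node.children, key=uct_score)
        # Expansion: add one child per action if the leaf was visited.
        if node.visits > 0 and len(node.state) < horizon:
            node.children = [Node(step(node.state, a), node, a)
                             for a in actions]
            node = random.choice(node.children)
        # Simulation: random rollout to the horizon.
        state = node.state
        while len(state) < horizon:
            state = step(state, random.choice(actions))
        r = reward(state)
        # Backpropagation: propagate the rollout reward to the root.
        while node:
            node.visits += 1
            node.value += r
            node = node.parent
    # Return the most-visited first action.
    return max(root.children, key=lambda n: n.visits).action
```

With a toy reward that favors trajectories starting with a planning step, the search concentrates visits on that branch:

```python
best = mcts(
    root_state=(),
    actions=["plan", "code", "test"],
    reward=lambda traj: 1.0 if traj and traj[0] == "plan" else 0.0,
    step=lambda s, a: s + (a,),
)
```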