🤖 AI Summary
Existing prompt optimization methods rely on manual design or supervised training, suffering from poor generalizability and strong task coupling. To address this, we propose the Hierarchical Multi-Agent Workflow (HMAW), a zero-shot, task-agnostic, human-free, and training-free framework for autonomous prompt generation. HMAW employs coordinated large language model (LLM) agents, each assuming a distinct role, to jointly perform instruction refinement, semantic alignment, and dynamic feedback, enabling end-to-end construction of high-quality prompts without external annotations or domain-specific priors. Evaluated across multiple benchmarks, HMAW significantly improves LLM answer accuracy while generating more comprehensive and context-adaptive prompts. This work establishes the first fully autonomous, hierarchically collaborative zero-shot prompt optimization paradigm, overcoming fundamental limitations of conventional manual prompt engineering and supervised fine-tuning approaches.
📝 Abstract
Large language models (LLMs) have shown great progress in responding to user questions, enabling a multitude of diverse applications. Yet, the quality of LLM outputs heavily depends on the prompt design, where a good prompt might enable the LLM to answer a very challenging question correctly. Therefore, recent works have developed many strategies for improving the prompt, including both manual crafting and in-domain optimization. However, their efficacy in unrestricted scenarios remains questionable, as the former depends on human design for specific questions and the latter usually generalizes poorly to unseen scenarios. To address these problems, we give LLMs the freedom to design the best prompts by themselves. Specifically, we employ a hierarchy of LLMs, first constructing a prompt with precise instructions and accurate wording in a hierarchical manner, and then using this prompt to generate the final answer to the user query. We term this pipeline Hierarchical Multi-Agent Workflow, or HMAW. In contrast with prior works, HMAW imposes no human restrictions, requires no training, and is completely task-agnostic while capable of adjusting to the nuances of the underlying task. Through both quantitative and qualitative experiments across multiple benchmarks, we verify that despite its simplicity, the proposed approach can create detailed and suitable prompts, further boosting the performance of current LLMs.
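The two-stage pipeline described in the abstract (hierarchically construct a prompt, then answer under it) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the specific role names, layer count, and the `llm` stub (standing in for any real chat-completion call) are assumptions made for the example.

```python
def llm(system_prompt: str, user_message: str) -> str:
    """Stand-in for a real LLM chat call (replace with an actual API client).

    The stub simply echoes its inputs so the sketch runs deterministically.
    """
    return f"[system: {system_prompt[:30]}...] answer to: {user_message}"


def hmaw_answer(query: str) -> str:
    """Hierarchically build a prompt for `query`, then answer under it."""
    # Upper layer: a high-level agent drafts broad instructions for the task.
    # (Role structure here is illustrative, not the paper's exact design.)
    draft = llm(
        "You write high-level instructions for how to answer a user query.",
        query,
    )
    # Middle layer: a second agent refines the draft into a precise,
    # well-worded system prompt tailored to the query.
    refined_prompt = llm(
        "Refine the following instructions into a precise, detailed "
        "system prompt:\n" + draft,
        query,
    )
    # Final layer: the worker LLM answers the original query under the
    # hierarchically generated prompt.
    return llm(refined_prompt, query)
```

No training or human-written templates are involved: each layer is just an inference call, which is what makes the workflow zero-shot and task-agnostic.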