AI Summary
Existing LLM prompt-optimization methods suffer from information loss and poor robustness in long-prompt scenarios. This paper proposes SCULPT (Systematic Tuning of Long Prompts), a framework that models prompts as hierarchical tree structures and introduces a Critic-Actor collaborative architecture, enabling context-aware, targeted prompt refinement via reflection-driven hierarchical optimization. SCULPT's tree-based prompt representation also supports zero-shot initialization, i.e., high-performing prompts can be generated without any human-written initial prompt. Extensive experiments demonstrate consistent gains over state-of-the-art methods across multiple tasks; improved robustness and cross-task generalization; strong resistance to adversarial perturbations; and stability and interpretability validated through both qualitative and quantitative analyses.
Abstract
Prompt optimization is essential for the effective utilization of large language models (LLMs) across diverse tasks. While existing optimization methods are effective at optimizing short prompts, they struggle with longer, more complex ones, often risking information loss and remaining sensitive to small perturbations. To address these challenges, we propose SCULPT (Systematic Tuning of Long Prompts), a framework that treats prompt optimization as a hierarchical tree refinement problem. SCULPT represents prompts as tree structures, enabling targeted modifications while preserving contextual integrity. It employs a Critic-Actor framework that generates reflections and applies actions to refine the prompt. Evaluations demonstrate SCULPT's effectiveness on long prompts, its robustness to adversarial perturbations, and its ability to generate high-performing prompts even without any initial human-written prompt. Compared to existing state-of-the-art methods, SCULPT consistently improves LLM performance by preserving essential task information while applying structured refinements. Both qualitative and quantitative analyses show that SCULPT produces more stable and interpretable prompt modifications, ensuring better generalization across tasks.
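To make the core idea concrete, here is a rough illustrative sketch (not the authors' implementation): a long prompt is modeled as a tree of sections, a critic flags one node as problematic, and an actor rewrites only that node, leaving the rest of the prompt intact. All names (`PromptNode`, `apply_action`) and the example prompt are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PromptNode:
    """One section of a long prompt: a title, its text, and subsections."""
    title: str
    text: str = ""
    children: list["PromptNode"] = field(default_factory=list)

    def render(self, depth: int = 0) -> str:
        # Flatten the tree back into a linear prompt string.
        indent = "  " * depth
        lines = [f"{indent}{self.title}: {self.text}".rstrip()]
        for child in self.children:
            lines.append(child.render(depth + 1))
        return "\n".join(lines)

def apply_action(root: PromptNode, path: list[int], new_text: str) -> None:
    """Targeted edit: rewrite only the node at `path`; siblings are preserved."""
    node = root
    for i in path:
        node = node.children[i]
    node.text = new_text

# Hypothetical long prompt with two subsections.
prompt = PromptNode("Task", "Classify support tickets.", [
    PromptNode("Rules", "Use only the listed labels."),
    PromptNode("Output format", "Return JSON."),
])

# A critic's reflection might flag the output-format node as ambiguous;
# the actor then rewrites just that subtree, not the whole prompt.
apply_action(prompt, [1], 'Return JSON like {"label": "..."}')
print(prompt.render())
```

The point of the tree representation is locality: an edit touches one subtree, so the surrounding task information cannot be lost or paraphrased away, which is the failure mode of rewriting a long prompt wholesale.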