DelvePO: Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization

📅 2025-10-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing prompt optimization methods rely on LLM-driven stochastic rewriting, suffering from premature convergence to local optima, unstable performance, and poor cross-task transferability. This paper proposes DelvePO, a task-agnostic prompt optimization framework based on self-evolution. Its core innovation is a novel direction-guided self-evolution mechanism: it decouples prompt components to enable interpretable analysis of influencing factors, and incorporates a working memory module to mitigate model uncertainty—thereby enhancing optimization stability and generalization. DelvePO is compatible with both open-source (e.g., Llama, Qwen) and closed-source (e.g., GPT series) LLMs. Extensive experiments across diverse domains demonstrate significant improvements over state-of-the-art methods. Empirical validation on DeepSeek, Qwen2.5, and GPT-4o-mini confirms its strong effectiveness and broad transferability across architectures and tasks.

📝 Abstract
Prompt Optimization has emerged as a crucial approach due to its capability to steer Large Language Models to solve various tasks. However, current works mainly rely on the random rewriting ability of LLMs, and the optimization process generally focuses on specific influencing factors, which makes it easy to fall into local optima. Besides, the performance of the optimized prompt is often unstable, which limits its transferability across different tasks. To address the above challenges, we propose $\textbf{DelvePO}$ ($\textbf{D}$irection-Guid$\textbf{e}$d Se$\textbf{l}$f-E$\textbf{v}$olving Framework for Fl$\textbf{e}$xible $\textbf{P}$rompt $\textbf{O}$ptimization), a task-agnostic framework that optimizes prompts in a self-evolving manner. In our framework, we decouple prompts into different components that can be used to explore the impact that different factors may have on various tasks. On this basis, we introduce working memory, through which LLMs can alleviate the deficiencies caused by their own uncertainty and further obtain key insights to guide the generation of new prompts. Extensive experiments were conducted on different tasks covering various domains for both open- and closed-source LLMs, including DeepSeek-R1-Distill-Llama-8B, Qwen2.5-7B-Instruct, and GPT-4o-mini. Experimental results show that DelvePO consistently outperforms previous SOTA methods under identical experimental settings, demonstrating its effectiveness and transferability across different tasks.
Problem

Research questions and friction points this paper is trying to address.

Optimizing prompts to avoid local optima in LLMs
Enhancing prompt stability for better task transferability
Developing task-agnostic self-evolving prompt optimization framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Direction-guided self-evolving framework for prompt optimization
Decouples prompts into components to explore factor impacts
Uses working memory to guide new prompt generation
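The mechanism described in these bullets — decoupling a prompt into components, mutating them, and using a working memory of past outcomes to steer which component to edit next — can be illustrated with a minimal, purely hypothetical sketch. The component names, the placeholder mutation, and the length-based fitness function below are illustrative stand-ins, not the paper's actual implementation (which uses LLM-driven rewriting and task-level evaluation):

```python
import random

# Illustrative component decomposition (assumed names, not from the paper).
COMPONENTS = ["role", "instruction", "examples"]

def mutate(prompt, component):
    """Rewrite one decoupled component (stand-in for an LLM rewrite)."""
    new = dict(prompt)
    new[component] = prompt[component] + "*"  # placeholder edit
    return new

def score(prompt):
    """Toy fitness: instruction length (stand-in for task evaluation)."""
    return len(prompt["instruction"])

def evolve(seed_prompt, generations=10, rng=None):
    rng = rng or random.Random(0)
    best, best_score = seed_prompt, score(seed_prompt)
    # Working memory: accumulated gain observed per component.
    memory = {c: 0.0 for c in COMPONENTS}
    for _ in range(generations):
        # Direction guidance: prefer the component whose past edits
        # helped most, with a small random tie-breaker for exploration.
        component = max(COMPONENTS, key=lambda c: memory[c] + rng.random() * 0.1)
        candidate = mutate(best, component)
        gain = score(candidate) - best_score
        memory[component] += gain  # record the insight in working memory
        if gain > 0:
            best, best_score = candidate, best_score + gain
    return best, best_score

seed = {"role": "You are a helpful assistant.",
        "instruction": "Answer concisely.",
        "examples": ""}
best, s = evolve(seed)
```

In this sketch, edits to the `instruction` component raise the toy fitness, so the memory steers subsequent generations toward it rather than sampling components uniformly — the same intuition as the paper's direction-guided search, without its LLM-based operators.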
Tao Tao
University of Maryland
Guanghui Zhu
State Key Laboratory for Novel Software Technology, Nanjing University
Lang Guo
State Key Laboratory for Novel Software Technology, Nanjing University
Hongyi Chen
State Key Laboratory for Novel Software Technology, Nanjing University
Chunfeng Yuan
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Computer Vision, Pattern Recognition, Machine Learning, Human Action Recognition, Sparse Representation
Yihua Huang
State Key Laboratory for Novel Software Technology, Nanjing University