🤖 AI Summary
Existing prompt optimization methods rely on large language models (e.g., GPT-4) to generate lengthy, complex prompts online, rendering them ill-suited for lightweight inference models and often degrading performance. To address this, we propose MePO—a lightweight, locally deployable prompt optimizer that eliminates dependence on large models. Our key contributions are threefold: (1) we formally define model-agnostic, interpretable prompt quality criteria for the first time; (2) we construct a preference dataset aligned with these criteria; and (3) we combine preference learning with distillation into a lightweight LLM, enabling prompt optimization with no dependence on large models. Extensive experiments demonstrate that MePO significantly outperforms GPT-4-driven approaches across diverse tasks and models—including resource-constrained ones—while reducing computational overhead and mitigating privacy risks. The model and dataset are publicly released.
📝 Abstract
Prompt optimization (PO) offers a practical alternative to fine-tuning large language models (LLMs), enabling performance improvements without altering model weights. Existing methods typically rely on advanced, large-scale LLMs like GPT-4 to generate optimized prompts. However, due to limited downward compatibility, verbose, instruction-heavy prompts from advanced LLMs can overwhelm lightweight inference models and degrade response quality. In this work, we rethink prompt optimization through the lens of interpretable design. We first identify a set of model-agnostic prompt quality merits and empirically validate their effectiveness in enhancing prompt and response quality. We then introduce MePO, a merit-guided, lightweight, and locally deployable prompt optimizer trained on our preference dataset built from merit-aligned prompts generated by a lightweight LLM. Unlike prior work, MePO avoids reliance on online optimization, reduces cost and privacy concerns, and, by learning clear, interpretable merits, generalizes effectively to both large-scale and lightweight inference models. Experiments demonstrate that MePO achieves better results across diverse tasks and model types, offering a scalable and robust solution for real-world deployment. Our model and dataset are available at: https://github.com/MidiyaZhu/MePO