Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE

📅 2025-09-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Large language model (LLM) prompts are highly sensitive to semantically equivalent paraphrasing, which causes large performance swings; existing prompt optimization methods target only pointwise accuracy and neglect robustness in the discrete semantic space. Method: This work formally defines the textual “sharpness” of a prompt and introduces an operational robustness criterion over a semantic neighborhood. It proposes TARE, a black-box, derivative-free framework that alternates sampling-based adversarial search over hard paraphrases with robust selection of candidates whose neighborhoods remain strong. The extension ATARE learns anisotropic weights that shape the neighborhood and adapts its radius over time. Contribution/Results: Across diverse tasks, the resulting prompts preserve accuracy under paraphrasing and outperform accuracy-only prompt search baselines while remaining computationally practical.

📝 Abstract

The performance of Large Language Models (LLMs) hinges on carefully engineered prompts. However, automated prompt search remains brittle: small, semantics-preserving paraphrases often cause large performance swings. Prevailing prompt optimization methods, ranging from heuristic edits and reinforcement learning to evolutionary search, primarily target point-wise accuracy. They seldom enforce paraphrase invariance or search stability, and therefore cannot remedy this brittleness in practice. We identify this brittleness as the textual sharpness of the prompt landscape. In this work, we provide the first formal treatment of textual sharpness in the discrete, semantic space of prompts, together with an operational robustness criterion over a semantic neighborhood; the design is black-box (API-only) and requires no gradients or updates to the model's parameters. We then introduce TARE (Textual Sharpness-Aware Evolving), a derivative-free framework that alternates between an inner, sampling-based adversarial search that stresses a prompt with hard paraphrases and an outer, robust selection that prefers candidates whose neighborhoods remain strong. We further propose ATARE, which learns anisotropic weights to shape the semantic neighborhood and adapts its radius over time to balance exploration and fidelity. We evaluate our methods on diverse tasks; by minimizing the textual sharpness gap, they yield prompts that preserve accuracy under paraphrasing and outperform accuracy-only prompt search while remaining computationally practical.
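
The alternation the abstract describes (an inner adversarial search over paraphrases, an outer robust selection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `tare_sketch`, the callables `score`, `paraphrase`, and `mutate`, and all default parameters are hypothetical placeholders, and taking the minimum over sampled neighbors is only one plausible reading of "prefers candidates whose neighborhoods remain strong".

```python
import random
from typing import Callable, List


def tare_sketch(
    seed_prompts: List[str],
    score: Callable[[str], float],                    # assumed: task performance of a prompt
    paraphrase: Callable[[str, random.Random], str],  # assumed: semantics-preserving rewrite
    mutate: Callable[[str, random.Random], str],      # assumed: proposes a new candidate
    rounds: int = 5,
    k_neighbors: int = 4,
    population: int = 4,
    seed: int = 0,
) -> str:
    """Derivative-free loop: stress each candidate with paraphrases, keep the robust ones."""
    rng = random.Random(seed)
    pool = list(seed_prompts)

    def robust_score(prompt: str) -> float:
        # Inner step: sample paraphrases and keep the hardest (lowest-scoring) one.
        neighbors = [paraphrase(prompt, rng) for _ in range(k_neighbors)]
        return min([score(prompt)] + [score(n) for n in neighbors])

    for _ in range(rounds):
        # Outer step: propose new candidates, then select by worst-case neighborhood score.
        candidates = pool + [mutate(p, rng) for p in pool]
        ranked = sorted(candidates, key=robust_score, reverse=True)
        pool = ranked[:population]
    return pool[0]
```

With a toy `score` in which one prompt is accurate but collapses under paraphrase and another is slightly less accurate but stable, this loop returns the stable prompt, which is the behavior an accuracy-only search would miss.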

Problem

Research questions and friction points this paper is trying to address.

Optimizing prompt robustness against semantic paraphrasing in LLMs

Reducing performance instability caused by small prompt variations

Developing gradient-free methods for sharpness-aware prompt evolution
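
One way to make the three points above concrete is to write the instability as a gap between a prompt's own score and its worst score within a semantic neighborhood. The notation below is an assumption introduced here for illustration, not the paper's definition: $f(p)$ denotes task performance of prompt $p$, and $\mathcal{N}_\rho(p)$ denotes the set of paraphrases of $p$ within semantic radius $\rho$ (including $p$ itself).

```latex
\[
\mathrm{Sharp}_\rho(p) \;=\; f(p) \;-\; \min_{p' \in \mathcal{N}_\rho(p)} f(p'),
\qquad
p^\star \;=\; \arg\max_{p} \; \min_{p' \in \mathcal{N}_\rho(p)} f(p')
\]
```

Under this reading, a prompt is "sharp" when the gap is large, and a gradient-free method would estimate the inner minimum by sampling paraphrases rather than by differentiation.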

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sharpness-Aware Prompt Evolving framework TARE

Derivative-free adversarial search with hard paraphrases

Anisotropic neighborhood shaping with an adaptive radius (ATARE)
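
As a rough illustration of what learnable neighborhood weights and an adaptive radius could look like, the sketch below keeps one weight per paraphrase "direction" and adjusts a scalar radius from the observed robustness gap. Everything here is assumed for illustration: the function names `update_weights` and `adapt_radius`, the multiplicative-weights rule, the direction labels, and the grow/shrink thresholds are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict


def update_weights(
    weights: Dict[str, float],
    drops: Dict[str, float],   # assumed: observed score drop per paraphrase direction
    lr: float = 1.0,
) -> Dict[str, float]:
    """Multiplicative-weights update: directions that hurt more receive more sampling mass."""
    raw = {k: w * math.exp(lr * drops.get(k, 0.0)) for k, w in weights.items()}
    total = sum(raw.values())
    return {k: v / total for k, v in raw.items()}


def adapt_radius(
    radius: float,
    gap: float,                # assumed: current sharpness gap of the best prompt
    target_gap: float = 0.05,
    factor: float = 1.25,
    r_min: float = 1.0,
    r_max: float = 8.0,
) -> float:
    """Widen the neighborhood when the prompt is already robust, tighten it otherwise."""
    new_radius = radius * factor if gap <= target_gap else radius / factor
    return max(r_min, min(r_max, new_radius))
```

For example, starting from equal weights on hypothetical "lexical" and "syntactic" directions, a larger observed drop on lexical paraphrases shifts sampling toward lexical edits, so the adversarial search spends its budget where the prompt is most fragile.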

🔎 Similar Papers

Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets