TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a failure mode of LLM-driven prompt optimization: iteratively rewritten prompts grow longer, accumulate narrow sample-specific rules, and generalize poorly beyond the training distribution, which the authors call prompt distributional overfitting. They formalize the problem through representational inefficiency, a dual-factor measure combining capacity cost and scope narrowness, and propose TextReg, a regularization framework for text-space optimization with three components: Dual-Evidence Gradient Purification, Semantic Edit Regularization, and Regularization-Guided Prompt Update. Across multiple reasoning benchmarks, TextReg substantially improves out-of-distribution generalization, with accuracy gains of up to 11.8% over TextGrad and 16.5% over REVOLVE.

📝 Abstract

Large language models (LLMs) are highly sensitive to the prompts used to specify task objectives and behavioral constraints. Many recent prompt optimization methods iteratively rewrite prompts using LLM-generated feedback, but the resulting prompts often become longer, accumulate narrow sample-specific rules, and generalize poorly beyond the training distribution. We study this failure mode as prompt distributional overfitting and argue that it reflects a lack of representation control in discrete text-space optimization. We formalize this view through representational inefficiency, a dual-factor measure that decomposes prompt inefficiency into capacity cost and scope narrowness, attributing distributional prompt overfitting to their coupled growth during optimization. We propose TextReg, a regularization framework that realizes a soft-penalty objective through regularized textual gradients, combining Dual-Evidence Gradient Purification, Semantic Edit Regularization, and Regularization-Guided Prompt Update. Across multiple reasoning benchmarks, TextReg substantially improves out-of-distribution (OOD) generalization, with accuracy gains of up to +11.8% over TextGrad and +16.5% over REVOLVE.
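
To make the soft-penalty idea concrete, here is a minimal sketch of choosing between candidate prompts by task loss plus a weighted inefficiency penalty. It is not the authors' implementation. The abstract gives no formulas, so everything below is an assumption for illustration: the word-count proxy for capacity cost, the keyword-match proxy for scope narrowness, the additive combination, the weight `lam`, the word `budget`, and all function names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Candidate:
    prompt: str
    task_loss: float  # e.g. 1 - training accuracy, supplied by the caller


def capacity_cost(prompt: str, budget: int = 50) -> float:
    """Hypothetical proxy: words used relative to an assumed word budget."""
    return len(prompt.split()) / budget


def scope_narrowness(prompt: str, train_terms: set[str]) -> float:
    """Hypothetical proxy: share of sentences naming a training-specific term."""
    sentences = [s.strip().lower() for s in prompt.split(".") if s.strip()]
    if not sentences:
        return 0.0
    narrow = sum(any(term in s for term in train_terms) for s in sentences)
    return narrow / len(sentences)


def penalized_score(c: Candidate, train_terms: set[str], lam: float) -> float:
    """Task loss plus a soft penalty; the additive form is an assumption."""
    inefficiency = capacity_cost(c.prompt) + scope_narrowness(c.prompt, train_terms)
    return c.task_loss + lam * inefficiency


def select_prompt(cands: list[Candidate], train_terms: set[str], lam: float) -> Candidate:
    """Pick the candidate with the lowest penalized score."""
    return min(cands, key=lambda c: penalized_score(c, train_terms, lam))


SHORT = "Solve the problem step by step. State the final answer."
LONG = (
    "Solve the problem step by step. "
    "If the question mentions Alice, subtract two. "
    "If the total is 42 apples, answer seven. "
    "State the final answer."
)
TERMS = {"alice", "42 apples"}  # made-up sample-specific terms
CANDS = [Candidate(SHORT, 0.20), Candidate(LONG, 0.15)]
```

With `lam=0.0`, selection depends on training loss alone, so the longer prompt with sample-specific rules is chosen. With a positive weight such as `lam=0.5`, the shorter, more general prompt is chosen even though its training loss is slightly higher. This is the trade-off a regularizer of this kind is meant to encode.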

Problem

Research questions and friction points this paper is trying to address.

prompt distributional overfitting

out-of-distribution generalization

text-space optimization

representational inefficiency

prompt optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt distributional overfitting

text-space optimization

regularized textual gradients