🤖 AI Summary
This study investigates how prompt repetition affects the performance of mainstream large language models (Gemini, GPT, Claude, DeepSeek) on non-reasoning tasks. Through systematic API-based ablation experiments across multiple models, diverse tasks (text classification, information extraction, machine translation), and varying temperature settings, we find that simply repeating the input prompt yields average accuracy gains of 3.2–7.8%, without increasing output token count or response latency. To our knowledge, this is the first work to empirically demonstrate a consistent performance benefit of prompt repetition in non-reasoning settings, challenging the assumption that redundant prompts are inherently detrimental. We propose prompt repetition as a lightweight prompting optimization that requires no model fine-tuning, architectural modification, or additional inference cost, opening a practical direction for efficient prompt engineering in production LLM applications.
📝 Abstract
When reasoning is not used, repeating the input prompt improves performance for popular models (Gemini, GPT, Claude, and DeepSeek) without increasing the number of generated tokens or latency.
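The summary does not specify the paper's exact prompt format, so the transformation being studied can only be sketched. A minimal illustration, assuming the repetition is plain string duplication (the helper name `repeat_prompt` and the separator are hypothetical choices, not the authors' implementation):

```python
def repeat_prompt(prompt: str, n: int = 2, sep: str = "\n\n") -> str:
    """Return the prompt concatenated n times, joined by sep.

    Hypothetical sketch: the repetition count and separator are
    illustrative; the paper's exact formatting is not given here.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    return sep.join([prompt] * n)


# The duplicated prompt would be sent as the model input in place of
# the original; only the input grows, so the output token budget and
# generation latency are unchanged.
query = repeat_prompt("Translate to French: Hello, world.")
```

Because the change touches only the input side of the request, it can be dropped into an existing API pipeline without modifying the model, the decoding settings, or the output parsing.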