🤖 AI Summary
Existing prompt engineering methods for input-sensitive tasks like machine translation predominantly optimize instructions rather than critical input components, resulting in limited generalizability and efficiency.
Method: We propose the first lightweight, input-focused prompt rewriting framework: an end-to-end input rewriting network built upon a small-parameter language model (0.1B), trained via a novel back-translation–driven self-supervised objective to automatically refine source-language inputs. Crucially, no large language model is involved in training, significantly reducing computational overhead.
Contribution/Results: Our approach achieves performance gains comparable to large-model–assisted prompt engineering—+2.3 BLEU on average—across multiple machine translation benchmarks. It introduces minimal parameter overhead while maintaining strong transferability to other input-dependent downstream tasks, offering an efficient, scalable alternative to instruction-centric prompting paradigms.
📝 Abstract
In recent years, the growing interest in Large Language Models (LLMs) has significantly advanced prompt engineering, transitioning from manual design to model-based optimization. Prompts for LLMs generally comprise two components: the extit{instruction}, which defines the task or objective, and the extit{input}, which is tailored to the instruction type. In natural language generation (NLG) tasks such as machine translation, the extit{input} component is particularly critical, while the extit{instruction} component tends to be concise. Existing prompt engineering methods primarily focus on optimizing the extit{instruction} component for general tasks, often requiring large-parameter LLMs as auxiliary tools. However, these approaches exhibit limited applicability for tasks like machine translation, where the extit{input} component plays a more pivotal role. To address this limitation, this paper introduces a novel prompt optimization method specifically designed for machine translation tasks. The proposed approach employs a small-parameter model trained using a back-translation-based strategy, significantly reducing training overhead for single-task optimization while delivering highly effective performance. With certain adaptations, this method can also be extended to other downstream tasks.