Generative Prompt Internalization

📅 2024-11-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the high computational overhead caused by fixed-length prompts in large language model (LLM) applications, this paper proposes *Prompt Internalization*—a method that distills complex prompting behaviors into the model’s intrinsic capabilities, enabling prompt-free, efficient inference. Our approach comprises two key innovations: (1) a generative prompt internalization mechanism that jointly models prompt content and behavioral rationale; and (2) a role-swapping dialogue data synthesis technique that automatically generates high-quality training data using only the original prompt—requiring no human annotation. Evaluated in multi-task agent scenarios, our method reduces inference latency by 37% on average and decreases GPU memory consumption by 42%, while preserving or improving task performance. Crucially, it eliminates runtime prompt dependency entirely, enabling lightweight, zero-shot deployment without prompt engineering.

Technology Category

Application Category

📝 Abstract
Prompts used in recent large language model based applications are often fixed and lengthy, leading to significant computational overhead. To address this challenge, we propose Generative Prompt Internalization (GenPI), a lightweight method that employs a joint training approach. GenPI not only replicates the behavior of models with prompt inputs but also generates the content of the prompt along with reasons for why the model's behavior should change accordingly. We demonstrate that our approach effectively internalizes complex prompts across various agent-based application scenarios. For effective training without interactions with the dedicated environments, we introduce a data synthesis technique that autonomously collects conversational datasets by swapping the roles of the agent and environment. This method is especially useful in scenarios where only a predefined prompt is available without a corresponding training dataset. By internalizing complex prompts, Generative Prompt Internalization enables high performance and efficient inference without the need for explicit prompts.
Problem

Research questions and friction points this paper is trying to address.

Reduces computational overhead
Internalizes complex prompts
Autonomous data synthesis technique
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Prompt Internalization method
Joint training approach employed
Data synthesis technique introduced
🔎 Similar Papers
No similar papers found.