🤖 AI Summary
Users frequently encounter inefficient LLM responses due to ambiguous or poorly formulated prompts. This paper systematically investigates prompt failure in real-world human–LLM interactions and proposes an intent-aware prompt rewriting method grounded in LLMs: it preserves user intent while enhancing prompts via context-augmented, multi-model collaborative regeneration to improve response quality. Its key contributions are threefold: (1) the first empirical, LLM-centric study of prompt understanding in authentic dialogues; (2) empirical evidence that LLMs possess robust intent inference capabilities; and (3) validation that context-aware rewriting yields substantial gains—particularly in long, multi-turn dialogues. The method is rigorously evaluated across diverse LLM families and sizes, demonstrating consistent, significant improvements in response usefulness and accuracy under cross-domain, cross-intent, and cross-model settings. It establishes a scalable, generalizable paradigm for optimizing human–LLM interaction. (149 words)
📝 Abstract
Human-LLM conversations are increasingly becoming more pervasive in peoples' professional and personal lives, yet many users still struggle to elicit helpful responses from LLM Chatbots. One of the reasons for this issue is users' lack of understanding in crafting effective prompts that accurately convey their information needs. Meanwhile, the existence of real-world conversational datasets on the one hand, and the text understanding faculties of LLMs on the other, present a unique opportunity to study this problem, and its potential solutions at scale. Thus, in this paper we present the first LLM-centric study of real human-AI chatbot conversations, focused on investigating aspects in which user queries fall short of expressing information needs, and the potential of using LLMs to rewrite suboptimal user prompts. Our findings demonstrate that rephrasing ineffective prompts can elicit better responses from a conversational system, while preserving the user's original intent. Notably, the performance of rewrites improves in longer conversations, where contextual inferences about user needs can be made more accurately. Additionally, we observe that LLMs often need to -- and inherently do -- make emph{plausible} assumptions about a user's intentions and goals when interpreting prompts. Our findings largely hold true across conversational domains, user intents, and LLMs of varying sizes and families, indicating the promise of using prompt rewriting as a solution for better human-AI interactions.