🤖 AI Summary
This work proposes DiscoverLLM, a framework for handling ambiguous or open-ended requests from users who have not yet formed the intents behind them. A user simulator models the user's cognitive state as a hierarchy of intents that concretize as the model surfaces relevant options, and the degree of concretization serves as the reward signal for training. Models trained this way learn to balance exploration and refinement during interaction, diverging when intents are unclear and converging as they concretize. Across interactive benchmarks in creative writing, technical writing, and SVG drawing, DiscoverLLM achieves over 10% higher task performance while reducing conversation length by up to 40%. In a user study with 75 participants, it improved conversation satisfaction and efficiency compared to baselines.
📝 Abstract
To handle ambiguous and open-ended requests, Large Language Models (LLMs) are increasingly trained to interact with users to surface intents they have not yet expressed (e.g., by asking clarifying questions). However, users are often ambiguous because they have not yet formed their intents: they must observe and explore outcomes to discover what they want. Simply asking "what kind of tone do you want?" fails when users themselves do not know. We introduce DiscoverLLM, a novel and generalizable framework that trains LLMs to help users form and discover their intents. Central to our approach is a novel user simulator that models the user's cognitive state as a hierarchy of intents that progressively concretize as the model surfaces relevant options, where the degree of concretization serves as a reward signal that models can be trained to optimize. The resulting models learn to collaborate with users by adaptively diverging (i.e., exploring options) when intents are unclear and converging (i.e., refining and implementing) when intents concretize. Across the proposed interactive benchmarks in creative writing, technical writing, and SVG drawing, DiscoverLLM achieves over 10% higher task performance while reducing conversation length by up to 40%. In a user study with 75 human participants, DiscoverLLM improved conversation satisfaction and efficiency compared to baselines.
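To make the reward idea concrete, here is a minimal Python sketch of an intent hierarchy whose degree of concretization is used as a per-turn reward. It is an illustration under stated assumptions, not the paper's simulator: the `Intent` class, the string-matching rule in `surface`, the fraction-of-discovered-nodes measure, and the example intent names are all hypothetical, and the abstract does not specify how concretization is actually computed.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """A node in a hypothetical intent hierarchy (illustrative only)."""
    name: str
    children: list["Intent"] = field(default_factory=list)
    discovered: bool = False


def count_total(node: Intent) -> int:
    """Number of intents in the subtree rooted at `node`."""
    return 1 + sum(count_total(c) for c in node.children)


def count_discovered(node: Intent) -> int:
    """Number of intents in the subtree that the user has formed so far."""
    return int(node.discovered) + sum(count_discovered(c) for c in node.children)


def concretization(root: Intent) -> float:
    """Assumed measure: fraction of the hierarchy that has been discovered."""
    return count_discovered(root) / count_total(root)


def surface(node: Intent, options: set[str], parent_known: bool = True) -> None:
    """Mark intents as discovered when a response surfaces a matching option.

    Assumption: an intent can only concretize if its parent was already
    discovered before this turn, so intents form progressively, top down.
    """
    was_known = node.discovered
    if parent_known and node.name in options:
        node.discovered = True
    for child in node.children:
        surface(child, options, was_known)


def turn_reward(root: Intent, options: set[str]) -> float:
    """Reward for one assistant turn: the gain in concretization."""
    before = concretization(root)
    surface(root, options)
    return concretization(root) - before


# Example: the user knows they want a poem, but nothing more specific yet.
poem = Intent("poem", discovered=True, children=[
    Intent("tone:melancholic", children=[Intent("imagery:rain")]),
    Intent("form:free verse"),
])

# Offering a detail whose parent intent is not yet formed earns nothing.
r1 = turn_reward(poem, {"imagery:rain"})
# Exploring broad options lets the user recognize two top-level intents.
r2 = turn_reward(poem, {"tone:melancholic", "form:free verse"})
# Once the tone is formed, refining it surfaces the finer-grained intent.
r3 = turn_reward(poem, {"imagery:rain"})
print(r1, r2, r3)
```

In this toy setup, proposing a fine-grained detail too early yields no reward, while exploring broad options first and refining afterwards does, which mirrors the diverge-then-converge behavior the abstract describes.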