🤖 AI Summary
This work addresses the limitations of current text-to-image prompting when users struggle to articulate their visual intent, whether because their instructions are ambiguous or because they are unfamiliar with model capabilities. The authors propose Adaptive Prompt Elicitation (APE), an approach that combines language model priors with an information-theoretic framework to dynamically generate interpretable visual queries. These queries guide users in iteratively refining their intent while the system automatically compiles high-quality prompts. By moving beyond the conventional text-only prompting paradigm, APE improves alignment between user intent and generated outputs, as demonstrated on the IDEA-Bench and DesignBench benchmarks. A user study further shows a 19.8% improvement in intent alignment on complex tasks without imposing additional cognitive load.
📝 Abstract
Aligning text-to-image generation with user intent remains challenging, as users often provide ambiguous inputs and struggle with model idiosyncrasies. We propose Adaptive Prompt Elicitation (APE), a technique that adaptively asks visual queries to help users refine prompts without extensive writing. Our technical contribution is a formulation of interactive intent inference under an information-theoretic framework. APE represents latent intent as interpretable feature requirements using language model priors, adaptively generates visual queries, and compiles elicited requirements into effective prompts. Evaluation on IDEA-Bench and DesignBench shows that APE achieves stronger alignment with improved efficiency. A user study with challenging user-defined tasks demonstrates 19.8% higher alignment without additional workload. Our work contributes a principled approach to prompting that offers general users an effective and efficient complement to the prevailing prompt-based interaction paradigm for text-to-image models.
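The abstract sketches, but does not specify, how queries are chosen under the information-theoretic formulation. Below is a minimal sketch of one standard instantiation: greedy selection of the visual query with maximal expected information gain over a distribution of candidate feature requirements. Every name here (`entropy`, `select_query`, the toy hypotheses and `likelihood` model) is an illustrative assumption, not APE's actual interface.

```python
import math

# Minimal sketch of greedy expected-information-gain query selection,
# assuming latent intent is a distribution over candidate feature
# requirements. Illustrative only; not APE's actual formulation.

def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution {hypothesis: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def posterior(prior, likelihood, query, answer):
    """Bayes update of the intent distribution after observing an answer."""
    unnorm = {h: p * likelihood(answer, query, h) for h, p in prior.items()}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

def expected_info_gain(prior, likelihood, query, answers):
    """Expected reduction in intent entropy from asking `query`."""
    h0 = entropy(prior)
    gain = 0.0
    for a in answers:
        # Marginal probability that the user gives answer `a`.
        p_a = sum(p * likelihood(a, query, h) for h, p in prior.items())
        if p_a > 0:
            gain += p_a * (h0 - entropy(posterior(prior, likelihood, query, a)))
    return gain

def select_query(prior, likelihood, queries, answers):
    """Greedily pick the query expected to be most informative about intent."""
    return max(queries, key=lambda q: expected_info_gain(prior, likelihood, q, answers))

if __name__ == "__main__":
    # Toy example: two intent hypotheses, two candidate visual queries.
    prior = {"flat_style": 0.5, "photoreal": 0.5}

    def likelihood(answer, query, hypothesis):
        # "style?" perfectly separates the hypotheses; "color?" is uninformative.
        if query == "style?":
            return 1.0 if (answer == "flat") == (hypothesis == "flat_style") else 0.0
        return 0.5

    print(select_query(prior, likelihood, ["style?", "color?"], ["flat", "photo"]))
    # -> "style?" (1 bit of expected gain vs. 0 for "color?")
```

Greedy one-step information gain is the common choice in interactive elicitation systems because it is tractable per turn; whether APE uses this exact criterion, or a variant, is not stated in the abstract.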