🤖 AI Summary
Current text-to-image (T2I) models struggle to support intentional, nonlinear, and highly divergent exploration in visual design. To address this, we propose Expandora, a structured T2I interaction framework designed for creative exploration. Our approach integrates LLM-driven prompt engineering, T2I generation, and interactive visualization. Key contributions include: (1) a structured intent-input mechanism and an LLM-based prompt-expansion method enabling controllable diversity; and (2) a mind-map–inspired, nonlinear interaction paradigm aligned with designers’ iterative and branching workflows. A user study demonstrates that Expandora significantly enhances prompt diversity—increasing the number of prompts attempted per unit time by 62%—and substantially improves user satisfaction compared to conventional T2I interfaces.
📝 Abstract
Broad exploration of references is critical in the visual design process. While text-to-image (T2I) models offer efficient and customizable exploration, they often provide limited support for divergence. We conducted a formative study (N=6) to investigate the limitations of current T2I interactions for broad exploration and found that designers struggle to articulate exploratory intentions and to manage iterative, non-linear workflows. To address these challenges, we developed Expandora. Users specify their exploratory intentions and desired diversity levels through structured input, and an LLM-based pipeline generates tailored prompt variations. The results are displayed in a mindmap-like interface that encourages non-linear workflows. A user study (N=8) demonstrated that Expandora significantly increases prompt diversity, the number of prompts users tried within a given time, and user satisfaction compared to the baseline. Nonetheless, its limited support for convergent thinking suggests opportunities for holistically improving creative processes.