🤖 AI Summary
Designers struggle to effectively integrate text prompts, annotations, and scribbles—three distinct input modalities—during the image refinement phase in generative AI image tools.
Method: We conducted a preliminary comparative study with seven professional designers using a digital paper prototype, combining contextual interviews with task-based behavioral observation to analyze input strategies, cognitive load, and patterns of AI misinterpretation.
Results: We identify clear functional boundaries and complementary synergies among the modalities: annotations excel at spatial referencing and identifying in-image elements; scribbles support precise control over shape and position; text prompts best stimulate semantic creativity. Key bottlenecks include frequent AI misinterpretation of visual cues and the high cognitive cost of crafting effective text prompts. Grounded in these findings, we propose multimodal prompt design principles that balance expressivity, interpretability, and efficiency, offering both theoretical grounding and practical guidelines for interaction paradigms in next-generation GenAI design tools.
📝 Abstract
Generative AI (GenAI) tools are increasingly integrated into design workflows. While text prompts remain the primary input method for GenAI image tools, designers often struggle to craft effective ones. Moreover, research has primarily focused on input methods for ideation, with limited attention to refinement tasks. This study explores designers' preferences for three input methods (text prompts, annotations, and scribbles) through a preliminary digital paper-based study with seven professional designers. Designers preferred annotations for spatial adjustments and for referencing in-image elements, while scribbles were favored for specifying attributes such as shape, size, and position, often in combination with other methods. Text prompts excelled at providing detailed descriptions or when designers sought greater creativity from the GenAI. However, designers expressed concerns about AI misinterpreting annotations and scribbles, as well as the effort needed to craft effective text prompts. These insights inform GenAI interface design to better support refinement tasks, align with design workflows, and enhance communication with AI systems.