Varif.ai to Vary and Verify User-Driven Diversity in Scalable Image Generation

📅 2025-06-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In text-to-image generation, a lack of diversity becomes apparent after a few rounds of generation, and existing diversification mechanisms give users little fine-grained control over which attributes vary (e.g., color, brand). This hinders creative ideation and fair representation. To address this, we propose Varif.ai, a generate-verify-vary loop that combines text-to-image models with large language models: it (re)generates a set of images, verifies whether user-specified attributes have sufficient coverage, and varies existing or new attributes. An elicitation study uncovered user needs for diversity control, a pilot validation showed that Varif.ai made diverse image sets easier to achieve, and a controlled evaluation with 20 participants found it more effective than baseline methods across various scenarios.

📝 Abstract

Diversity in image generation is essential to ensure fair representations and support creativity in ideation. Hence, many text-to-image models have implemented diversification mechanisms. Yet, after a few iterations of generation, a lack of diversity becomes apparent, because each user has their own diversity goals (e.g., different colors, brands of cars), and there are many attributes that could be specified. To support user-driven diversity control, we propose Varif.ai, which employs text-to-image and Large Language Models to iteratively i) (re)generate a set of images, ii) verify if user-specified attributes have sufficient coverage, and iii) vary existing or new attributes. Through an elicitation study, we uncovered user needs for diversity in image generation. A pilot validation showed that Varif.ai made achieving diverse image sets easier. In a controlled evaluation with 20 participants, Varif.ai proved more effective than baseline methods across various scenarios. Varif.ai thus supports user control of diversity in image generation for creative ideation and scalable image generation.
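
The generate, verify, vary loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `generate_images`, `verify`, `vary`, and `diversify` are hypothetical names, the "images" are plain dictionaries, and the generator is a stub that is deliberately biased toward one color to mimic low default diversity. A real system would call a text-to-image model to generate and a language or vision model to label attributes. The coverage rule (each value must reach a minimum share of the set) is also an assumption made for illustration.

```python
import random
from collections import Counter

COLORS = ["red", "blue", "green", "white"]

def generate_images(prompt, n, rng):
    """Stub generator (assumption): returns n fake 'images' as attribute dicts."""
    # If the prompt names a color, the stub honors it; otherwise it is
    # biased toward 'red' to mimic a model with low default diversity.
    named = [c for c in COLORS if c in prompt.split()]
    images = []
    for _ in range(n):
        color = named[0] if named else rng.choices(COLORS, weights=[7, 1, 1, 1])[0]
        images.append({"prompt": prompt, "color": color})
    return images

def verify(images, attribute, values, min_share):
    """Return the attribute values whose share of the set is below min_share."""
    counts = Counter(img[attribute] for img in images)
    return [v for v in values if counts[v] / len(images) < min_share]

def vary(base_prompt, value):
    """Mutate the prompt so that it asks for an under-covered value."""
    return f"{value} {base_prompt}"

def diversify(base_prompt, attribute, values, n=12, min_share=0.15,
              max_rounds=5, seed=0):
    """Generate a set, then repeatedly verify coverage and vary the prompt."""
    rng = random.Random(seed)
    images = generate_images(base_prompt, n, rng)
    for _ in range(max_rounds):
        missing = verify(images, attribute, values, min_share)
        if not missing:
            break  # every requested value has sufficient coverage
        for value in missing:
            # Replace one image of the currently most common value with a
            # newly generated image that targets the under-covered value.
            top = Counter(img[attribute] for img in images).most_common(1)[0][0]
            idx = next(i for i, img in enumerate(images) if img[attribute] == top)
            images[idx] = generate_images(vary(base_prompt, value), 1, rng)[0]
    return images

if __name__ == "__main__":
    result = diversify("photo of a car", "color", COLORS)
    print(Counter(img["color"] for img in result))
```

In this sketch the set size stays fixed and only over-represented images are replaced, which is one possible design choice; a system could equally append new images or let the user pick which ones to regenerate.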

Problem

Research questions and friction points this paper is trying to address.

Users lack control over diversity in scalable image generation

Lack of diversity becomes apparent after a few iterations of generation

User-specified attributes need to be verified and varied to obtain diverse image sets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses text-to-image and LLMs for iterative generation

Verifies coverage of user-specified attributes

Varies attributes to enhance diversity

🔎 Similar Papers

Tackling copyright issues in AI image generation through originality estimation and genericization