🤖 AI Summary
Text-to-image diffusion models often suffer from limited output diversity, hindering their utility in creative exploration tasks. To address this, we propose WANDER—a framework that introduces novelty search to text-to-image generation for the first time. WANDER employs an *emitter* mechanism to steer the search toward distinct semantic regions, leverages large language models (LLMs) to semantically evolve prompts, and quantifies image novelty via CLIP embeddings to guide diverse generation. Using FLUX-DEV for generation and GPT-4o-mini for mutation, WANDER achieves statistically significant improvements over existing evolutionary baselines on standard diversity metrics. Ablation studies confirm the emitters' critical role in enhancing both semantic coverage and distributional breadth. This work establishes a new paradigm for controllable, interpretable, and highly diverse text-to-image synthesis.
📝 Abstract
Text-to-image diffusion models, while proficient at generating high-fidelity images, often suffer from limited output diversity, hindering their application in exploratory and ideation tasks. Existing prompt optimization techniques typically target aesthetic fitness or are ill-suited to the creative visual domain. To address this shortcoming, we introduce WANDER, a novelty search-based approach to generating diverse sets of images from a single input prompt. WANDER operates directly on natural language prompts, employing a Large Language Model (LLM) to semantically evolve prompts, and using CLIP embeddings to quantify image novelty. We additionally apply emitters to guide the search into distinct regions of the prompt space, and demonstrate that they boost the diversity of the generated images. Empirical evaluations using FLUX-DEV for generation and GPT-4o-mini for mutation demonstrate that WANDER significantly outperforms existing evolutionary prompt optimization baselines on diversity metrics. Ablation studies confirm the efficacy of emitters.
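As a rough illustration of the CLIP-based novelty signal the abstract describes: a common novelty-search formulation scores a candidate by its mean cosine distance to the *k* nearest embeddings already in the archive. The sketch below assumes that formulation (the paper's exact metric may differ), and the tiny orthogonal "embeddings" are toy stand-ins for real CLIP image embeddings.

```python
import numpy as np

def novelty_score(candidate, archive, k=3):
    """Mean cosine distance from `candidate` to its k nearest neighbours
    in `archive`; higher means the image is more novel.
    A generic novelty-search scoring rule, not necessarily WANDER's exact one."""
    archive = np.asarray(archive, dtype=float)
    c = np.asarray(candidate, dtype=float)
    c = c / np.linalg.norm(c)
    a = archive / np.linalg.norm(archive, axis=1, keepdims=True)
    dists = 1.0 - a @ c                 # cosine distance to every archived embedding
    k = min(k, len(dists))
    return float(np.sort(dists)[:k].mean())

# Toy archive: four orthogonal unit vectors standing in for CLIP embeddings.
archive = np.eye(4)
print(novelty_score(np.array([1.0, 0.0, 0.0, 0.0]), archive))   # ≈ 0.667: close to an archived image
print(novelty_score(np.array([-1.0, 0.0, 0.0, 0.0]), archive))  # = 1.0: farther from the archive
```

In a novelty-search loop, candidates with high scores would be kept (and their embeddings added to the archive), so the search is continually pushed toward regions of embedding space it has not yet covered.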