Inspiration Seeds: Learning Non-Literal Visual Combinations for Generative Exploration

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of current generative models, which rely heavily on precise textual prompts and thus struggle to support the open-ended and ambiguous visual exploration typical of early-stage creative ideation. The authors propose a prompt-free, feedforward generative framework that synthesizes semantically meaningful and visually coherent image combinations from only two input images. By eliminating dependence on language, the method constructs training data exclusively from visual triplets and leverages a CLIP-based sparse autoencoder to extract disentangled editing directions from the CLIP latent space, enabling non-literal recombination of visual concepts. This approach empowers designers to conduct intuitive and efficient visual exploration during the initial phases of the creative process, fostering inspiration without the constraints of explicit textual guidance.
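The core mechanism described above (a sparse autoencoder over CLIP image embeddings, whose decoded activation differences act as editing directions) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the top-k SAE variant, and the random weights standing in for trained parameters are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: CLIP embedding width and an overcomplete SAE dictionary
# (hypothetical values, not taken from the paper).
d_clip, d_dict, k = 512, 2048, 32

# Randomly initialized weights stand in for a trained CLIP SAE.
W_enc = rng.normal(scale=0.02, size=(d_clip, d_dict))
W_dec = rng.normal(scale=0.02, size=(d_dict, d_clip))
b_enc = np.zeros(d_dict)

def sae_encode(z, k=k):
    """Top-k sparse activations over the dictionary (one common SAE variant)."""
    a = np.maximum(z @ W_enc + b_enc, 0.0)   # ReLU pre-activations
    a[np.argsort(a)[:-k]] = 0.0              # keep only the k largest
    return a

def editing_direction(z_a, z_b):
    """Decode the concepts active in image B but not in image A
    back into CLIP space, giving a direction to shift A along."""
    delta = np.maximum(sae_encode(z_b) - sae_encode(z_a), 0.0)
    return delta @ W_dec

# Two mock unit-norm embeddings standing in for the two input images.
z_a = rng.normal(size=d_clip); z_a /= np.linalg.norm(z_a)
z_b = rng.normal(size=d_clip); z_b /= np.linalg.norm(z_b)

direction = editing_direction(z_a, z_b)
z_combined = z_a + 0.5 * direction           # non-literal recombination of A and B
print(z_combined.shape)
```

In the paper's pipeline such directions are presumably used to build training triplets and to condition the feed-forward generator; here they only illustrate how sparse CLIP features can isolate concept differences between two images.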

📝 Abstract
While generative models have become powerful tools for image synthesis, they are typically optimized for executing carefully crafted textual prompts, offering limited support for the open-ended visual exploration that often precedes idea formation. In contrast, designers frequently draw inspiration from loosely connected visual references, seeking emergent connections that spark new ideas. We propose Inspiration Seeds, a generative framework that shifts image generation from final execution to exploratory ideation. Given two input images, our model produces diverse, visually coherent compositions that reveal latent relationships between inputs, without relying on user-specified text prompts. Our approach is feed-forward, trained on synthetic triplets of decomposed visual aspects derived entirely through visual means: we use CLIP Sparse Autoencoders to extract editing directions in CLIP latent space and isolate concept pairs. By removing the reliance on language and enabling fast, intuitive recombination, our method supports visual ideation at the early and ambiguous stages of creative work.
Problem

Research questions and friction points this paper is trying to address.

generative exploration
visual ideation
non-literal visual combinations
open-ended visual exploration
image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inspiration Seeds
visual ideation
prompt-free generation
CLIP Sparse Autoencoders
non-literal visual combinations