Do It Yourself (DIY): Modifying Images for Poems in a Zero-Shot Setting Using Weighted Prompt Manipulation

2025-09-15
Citations: 0
Influential: 0
AI Summary
This work addresses zero-shot poetry-to-image generation, tackling the challenge of dynamically customizing visual outputs according to users' subjective poetic interpretations. We propose Weighted Prompt Manipulation (WPM), which, to the best of our knowledge, is the first application of weighted prompt manipulation to poetry visualization: it enables fine-grained, tuning-free prompt editing by modulating cross-modal attention weights and the semantic intensity of text embeddings within diffusion models. Integrating large language models (e.g., GPT) for semantic parsing with diffusion models for generative synthesis, WPM supports real-time user control over imagery, artistic style, and emotional tone while preserving the zero-shot setting. Extensive evaluation across multiple poetry benchmarks demonstrates significant improvements in image semantic richness, contextual coherence, and interpretability. Our approach establishes a novel, interactive, and personalized paradigm for visualizing poetic meaning.

๐Ÿ“ Abstract
Poetry is an expressive form of art that invites multiple interpretations, as readers often bring their own emotions, experiences, and cultural backgrounds into their understanding of a poem. Recognizing this, we aim to generate images for poems and improve these images in a zero-shot setting, enabling audiences to modify images as per their requirements. To achieve this, we introduce a novel Weighted Prompt Manipulation (WPM) technique, which systematically modifies attention weights and text embeddings within diffusion models. By dynamically adjusting the importance of specific words, WPM enhances or suppresses their influence in the final generated image, leading to semantically richer and more contextually accurate visualizations. Our approach exploits diffusion models and large language models (LLMs) such as GPT in conjunction with existing poetry datasets, ensuring a comprehensive and structured methodology for improved image generation in the literary domain. To the best of our knowledge, this is the first attempt at integrating weighted prompt manipulation for enhancing imagery in poetic language.
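The idea of raising or lowering a word's influence through its text embedding can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest possible scheme, in which each token's embedding vector is multiplied by a user-chosen weight before being passed to the diffusion model. The function name `apply_word_weights`, the toy two-dimensional embeddings, and the example weights are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def apply_word_weights(
    tokens: List[str],
    embeddings: List[Vector],
    weights: Dict[str, float],
    default: float = 1.0,
) -> List[Vector]:
    """Scale each token embedding by its word weight (assumed scheme).

    A weight above 1.0 emphasizes a word, a weight below 1.0 suppresses it,
    and words without an explicit weight keep the default of 1.0.
    """
    if len(tokens) != len(embeddings):
        raise ValueError("tokens and embeddings must have the same length")
    weighted = []
    for token, vector in zip(tokens, embeddings):
        w = weights.get(token.lower(), default)
        weighted.append([w * x for x in vector])
    return weighted


# Toy example: emphasize "moon" and suppress "sea" in a short poetic prompt.
tokens = ["lonely", "moon", "over", "sea"]
embeddings = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [2.0, 2.0]]
weighted = apply_word_weights(tokens, embeddings, {"moon": 1.5, "sea": 0.5})
```

In a real pipeline the embeddings would come from the diffusion model's text encoder, and the weights could be chosen by the reader to match their interpretation of the poem.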
Problem

Research questions and friction points this paper is trying to address.

Generate images for poems in a zero-shot setting
Modify the generated images using a weighted prompt manipulation technique
Enhance the semantic richness of poetic visualizations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Weighted Prompt Manipulation (WPM) technique that modifies attention weights and text embeddings
Dynamically adjusts the importance of specific words in diffusion models
Combines diffusion models and LLMs such as GPT with existing poetry datasets for visualization
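The attention-weight side of the technique can be sketched in the same spirit. The snippet below is an illustrative assumption, not the paper's method: for a single image location it turns raw attention scores over the prompt tokens into probabilities, multiplies each by a per-token weight, and renormalizes so the result still sums to 1. The names `softmax` and `reweight_attention` and the renormalization step are choices made for this sketch only.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweight_attention(scores: List[float], token_weights: List[float]) -> List[float]:
    """Scale per-token attention by word weights, then renormalize (assumed scheme)."""
    if len(scores) != len(token_weights):
        raise ValueError("scores and token_weights must have the same length")
    probs = softmax(scores)
    scaled = [p * w for p, w in zip(probs, token_weights)]
    total = sum(scaled)
    if total == 0:
        raise ValueError("at least one token must keep a positive weight")
    return [s / total for s in scaled]


# Three tokens with equal raw scores; the middle token is given twice the weight.
attention = reweight_attention([0.0, 0.0, 0.0], [1.0, 2.0, 1.0])
```

Renormalizing keeps the attention row a valid distribution; skipping that step and using the scaled values directly is an equally plausible design, and the source does not say which variant WPM uses.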
Sofia Jamil
PhD Research Scholar
Large Language Model, Natural Language Processing, Text to Image Generation Models
Kotla Sai Charan
Department of Computer Science & Engineering, Indian Institute of Technology Patna, India
Sriparna Saha
Department of Computer Science & Engineering, Indian Institute of Technology Patna, India
Koustava Goswami
Research Scientist 2 @ Adobe Research
Natural Language Processing, Language Model, Multimodal Learning
K J Joseph
Research Scientist, Adobe Research
Deep Learning