PEO: Training-Free Aesthetic Quality Enhancement in Pre-Trained Text-to-Image Diffusion Models with Prompt Embedding Optimization

📅 2025-10-02

📈 Citations: 0

✨ Influential: 0

📝 Abstract

This paper introduces a novel approach to improving aesthetic quality in pre-trained text-to-image diffusion models when given a simple prompt. Our method, dubbed Prompt Embedding Optimization (PEO), uses a pre-trained text-to-image diffusion model as a backbone and optimizes the text embedding of a given simple, uncurated prompt to enhance the visual quality of the generated image. We achieve this with a tripartite objective function that improves the aesthetic fidelity of the generated image, ensures adherence to the optimized text embedding, and minimizes divergence from the initial prompt; the last of these is enforced through a prompt preservation term. Additionally, PEO is training-free and backbone-independent. Quantitative and qualitative evaluations confirm the effectiveness of the proposed method, which matches or exceeds the performance of state-of-the-art text-to-image and prompt adaptation methods.
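
To make the idea of a tripartite objective concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a tiny fixed function in place of the diffusion backbone, and hypothetical stand-ins for the three terms (`aesthetic`, `adherence`, `preservation`); the weights, step size, and finite-difference gradient are all illustrative choices that do not come from the paper. The only point it illustrates is that the prompt embedding is updated while the backbone stays frozen.

```python
import math

# Toy, frozen "backbone": a fixed mixing matrix followed by tanh.
# It stands in for the pre-trained diffusion model and is never updated.
W = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.2],
     [0.1, 0.0, 0.7]]

def generate(e):
    """Map an embedding to a toy 'image' feature vector."""
    return [math.tanh(sum(w * v for w, v in zip(row, e))) for row in W]

def aesthetic(x):
    # Hypothetical aesthetic scorer: higher is better.
    return sum(x) / len(x)

def adherence(x, e):
    # Hypothetical image/embedding agreement: negative mean squared gap.
    return -sum((xi - ei) ** 2 for xi, ei in zip(x, e)) / len(x)

def preservation(e, e0):
    # Penalty for drifting away from the original prompt embedding.
    return sum((ei - e0i) ** 2 for ei, e0i in zip(e, e0))

def loss(e, e0, w_aes=1.0, w_adh=0.5, w_pres=0.5):
    # Three-part objective; the weights are assumed values, not from the paper.
    x = generate(e)
    return (-w_aes * aesthetic(x)
            - w_adh * adherence(x, e)
            + w_pres * preservation(e, e0))

def numerical_grad(f, e, h=1e-5):
    # Central finite differences; a real system would use autodiff instead.
    grad = []
    for i in range(len(e)):
        up = e[:i] + [e[i] + h] + e[i + 1:]
        down = e[:i] + [e[i] - h] + e[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def optimize_embedding(e0, steps=100, lr=0.05):
    # Gradient descent on the embedding only; W is left untouched.
    e = list(e0)
    for _ in range(steps):
        g = numerical_grad(lambda v: loss(v, e0), e)
        e = [ei - lr * gi for ei, gi in zip(e, g)]
    return e

e0 = [0.2, -0.1, 0.3]          # embedding of the "simple prompt" (toy values)
e_opt = optimize_embedding(e0)
```

In this sketch the preservation weight controls how far `e_opt` may move from `e0`: raising `w_pres` keeps the optimized embedding closer to the original prompt at the cost of a smaller gain in the toy aesthetic score.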

Problem

Research questions and friction points this paper is trying to address.

Enhances aesthetic quality in text-to-image models without retraining

Optimizes text embeddings to improve generated image visual fidelity

Preserves original prompt meaning while boosting artistic appeal

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizes text embeddings for aesthetic enhancement

Uses a tripartite objective covering aesthetic fidelity, embedding adherence, and prompt preservation

Training-free and backbone-independent approach

🔎 Similar Papers

DiffArtist: Towards Structure and Appearance Controllable Image Stylization