The Intricate Dance of Prompt Complexity, Quality, Diversity, and Consistency in T2I Models

📅 2025-10-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work systematically investigates how prompt complexity affects the quality, diversity, and consistency of synthetic data generated by text-to-image (T2I) models. To address the lack of holistic evaluation, we propose the first multi-dimensional framework for assessing synthetic data utility, revealing an intrinsic trade-off: while increased prompt complexity reduces conditional controllability, it substantially narrows the distributional shift between synthetic and real data. Methodologically, we introduce a novel likelihood estimation mechanism grounded in pre-trained language models, integrated with diffusion-based generation and inference-time prompt expansion. Evaluated on CC12M, ImageNet-1K, and DCI benchmarks, our approach achieves superior generative diversity and aesthetic quality compared to real data. The study provides both theoretical insights and empirical evidence on the role of prompt engineering in synthetic data construction, advancing principled design of T2I prompting strategies for downstream data-centric applications.

Technology Category

Application Category

📝 Abstract
Text-to-image (T2I) models offer great potential for creating virtually limitless synthetic data, a valuable resource compared to fixed and finite real datasets. Previous works evaluate the utility of synthetic data from T2I models on three key desiderata: quality, diversity, and consistency. While prompt engineering is the primary means of interacting with T2I models, the systematic impact of prompt complexity on these critical utility axes remains underexplored. In this paper, we first conduct synthetic experiments to motivate the difficulty of generalization w.r.t. prompt complexity and explain the observed difficulty with theoretical derivations. Then, we introduce a new evaluation framework that can compare the utility of real data and synthetic data, and present a comprehensive analysis of how prompt complexity influences the utility of synthetic data generated by commonly used T2I models. We conduct our study across diverse datasets, including CC12M, ImageNet-1k, and DCI, and evaluate different inference-time intervention methods. Our synthetic experiments show that generalizing to more general conditions is harder than the other way round, since the former needs an estimated likelihood that is not learned by diffusion models. Our large-scale empirical experiments reveal that increasing prompt complexity results in lower conditional diversity and prompt consistency, while reducing the synthetic-to-real distribution shift, which aligns with the synthetic experiments. Moreover, current inference-time interventions can augment the diversity of the generations at the expense of moving outside the support of real data. Among those interventions, prompt expansion, by deliberately using a pre-trained language model as a likelihood estimator, consistently achieves the highest performance in both image diversity and aesthetics, even higher than that of real data.
Problem

Research questions and friction points this paper is trying to address.

Investigating how prompt complexity affects synthetic data quality, diversity, and consistency
Developing evaluation framework to compare real versus synthetic data utility
Analyzing inference-time interventions' impact on distribution shifts and diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces a framework comparing real and synthetic data utility
Analyzes prompt complexity impact on synthetic data quality
Uses pre-trained language model for prompt expansion optimization
🔎 Similar Papers
No similar papers found.
X
Xiaofeng Zhang
FAIR at Meta - Montreal, Mila - Quebec AI Institute, Université de Montréal
Aaron Courville
Aaron Courville
Professor, DIRO, Université de Montréal, Mila, Cifar CAI chair
Machine learningArtificial Intelligence
M
Michal Drozdzal
FAIR at Meta - Montreal
Adriana Romero-Soriano
Adriana Romero-Soriano
Fundamental AI Research, Meta
deep learningmachine learningAI