PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework

📅 2025-06-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three key challenges in high-quality creative poster generation: distorted text rendering, disconnection between artistic content and layout, and stylistic inconsistency. It proposes PosterCraft, an end-to-end unified generation framework. Methodologically, it introduces: (1) a four-stage cascaded optimization pipeline comprising text-rendering optimization, region-aware supervised fine-tuning, aesthetic-text reinforcement learning via best-of-n preference optimization, and joint vision-language feedback refinement; and (2) fully automated data-construction pipelines tailored to each stage, including the Text-Render-2M and HQ-Poster100K datasets. Experiments show that the method significantly outperforms open-source baselines in text rendering accuracy, layout coherence, and overall aesthetic quality, approaching the quality of state-of-the-art commercial systems.

📝 Abstract

Generating aesthetic posters is more challenging than generating simple design images: it requires not only precise text rendering but also the seamless integration of abstract artistic content, striking layouts, and overall stylistic harmony. To address this, we propose PosterCraft, a unified framework that abandons prior modular pipelines and rigid, predefined layouts, allowing the model to freely explore coherent, visually compelling compositions. PosterCraft employs a carefully designed, cascaded workflow to optimize the generation of high-aesthetic posters: (i) large-scale text-rendering optimization on our newly introduced Text-Render-2M dataset; (ii) region-aware supervised fine-tuning on HQ-Poster100K; (iii) aesthetic-text reinforcement learning via best-of-n preference optimization; and (iv) joint vision-language feedback refinement. Each stage is supported by a fully automated data-construction pipeline tailored to its specific needs, enabling robust training without complex architectural modifications. Across multiple experiments, PosterCraft significantly outperforms open-source baselines in rendering accuracy, layout coherence, and overall visual appeal, approaching the quality of SOTA commercial systems. Our code, models, and datasets can be found on the project page: https://ephemeral182.github.io/PosterCraft
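
As a rough illustration of the best-of-n idea named in stage (iii), the sketch below builds one preference pair from n sampled candidates. It is a generic sketch, not PosterCraft's actual procedure: the abstract does not describe how candidates are sampled or scored, so `generate`, `score`, and `best_of_n_pair` are hypothetical names, and plain strings stand in for generated posters.

```python
from typing import Callable, List, Tuple


def best_of_n_pair(
    prompt: str,
    generate: Callable[[str], str],   # hypothetical sampler: prompt -> candidate
    score: Callable[[str], float],    # hypothetical reward: candidate -> scalar
    n: int = 4,
) -> Tuple[str, str]:
    """Sample n candidates and return (chosen, rejected) for preference training.

    The highest-scoring candidate becomes the preferred sample and the
    lowest-scoring one the rejected sample.
    """
    if n < 2:
        raise ValueError("need at least two candidates to form a pair")
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    ranked = sorted(candidates, key=score)  # ascending by reward
    return ranked[-1], ranked[0]


if __name__ == "__main__":
    # Toy stand-ins: candidates are strings and the "reward" is their length.
    samples = iter(["aa", "a", "aaaa", "aaa"])
    chosen, rejected = best_of_n_pair("poster prompt", lambda p: next(samples), len)
    print(chosen, rejected)
```

In a real setup the resulting pairs would typically feed a preference-optimization objective, with the reward combining an aesthetic score and a text-accuracy score. That combination is an assumption here, since the abstract only names the stage.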

Problem

Research questions and friction points this paper is trying to address.

Generating aesthetic posters with seamless artistic integration

Overcoming rigid layouts for coherent visual compositions

Optimizing text rendering and layout in poster design

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified framework for aesthetic poster generation

Cascaded workflow with four optimization stages

Automated data-construction pipeline for robust training

🔎 Similar Papers

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models