🤖 AI Summary
This work proposes PosterVerse, an end-to-end, commercial-grade framework for automatic Chinese poster generation. It targets three weaknesses of existing methods: incoherent or incomplete design workflows, inaccurate text rendering, and limited adaptability to business needs, which together make it hard to balance high-density information with visual appeal. The framework is a three-stage pipeline: blueprint generation, background synthesis, and unified HTML-driven layout and text rendering. Its main contributions are (1) PosterDNA, which the authors describe as the first Chinese commercial poster dataset in HTML format; (2) fine-tuned large language models (LLMs) for design element extraction; (3) a tailored diffusion model for background generation; and (4) an extensible HTML rendering engine powered by a multimodal LLM (MLLM), which addresses the difficulty of rendering small-font, high-density text. Experiments indicate that the generated posters meet commercial standards in visual attractiveness, text alignment precision, and layout customizability, outperforming current approaches.
📝 Abstract
Commercial-grade poster design demands the seamless integration of aesthetic appeal with precise, informative content delivery. Current automated poster generation systems face significant limitations, including incomplete design workflows, poor text rendering accuracy, and insufficient flexibility for commercial applications. To address these challenges, we propose PosterVerse, a full-workflow, commercial-grade poster generation method that seamlessly automates the entire design process while delivering high-density and scalable text rendering. PosterVerse replicates professional design through three key stages: (1) blueprint creation using fine-tuned LLMs to extract key design elements from user requirements, (2) graphical background generation via customized diffusion models to create visually appealing imagery, and (3) unified layout-text rendering with an MLLM-powered HTML engine to guarantee high text accuracy and flexible customization. In addition, we introduce PosterDNA, a commercial-grade, HTML-based dataset tailored for training and validating poster design models. To the best of our knowledge, PosterDNA is the first Chinese poster generation dataset to introduce HTML typography files, enabling scalable text rendering and fundamentally solving the challenges of rendering small and high-density text. Experimental results demonstrate that PosterVerse consistently produces commercial-grade posters with appealing visuals, accurate text alignment, and customizable layouts, making it a promising solution for automating commercial poster design. The code and model are available at https://github.com/wuhaer/PosterVerse.
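To make the three-stage flow concrete, the following is a minimal sketch and not the PosterVerse implementation. All names (`TextBlock`, `Blueprint`, `make_blueprint`, `generate_background`, `render_html`) are hypothetical, and the first two stages are stubs standing in for the fine-tuned LLM and the diffusion model. The third stage illustrates why an HTML layer helps with small, dense text: the text is laid out as markup over the background image rather than being drawn by the image model, so it stays exact at any font size.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class TextBlock:
    text: str
    left_pct: int   # horizontal offset, percent of poster width
    top_pct: int    # vertical offset, percent of poster height
    font_px: int    # font size in CSS pixels


@dataclass
class Blueprint:
    background_prompt: str
    blocks: list[TextBlock]


def make_blueprint(user_request: str) -> Blueprint:
    """Stage 1 stand-in (assumption): a fine-tuned LLM would extract these fields."""
    return Blueprint(
        background_prompt=f"poster background for: {user_request}",
        blocks=[
            TextBlock(user_request, 10, 8, 48),                     # headline
            TextBlock("Details & fine print <2025>", 10, 85, 12),   # small text
        ],
    )


def generate_background(prompt: str) -> str:
    """Stage 2 stand-in (assumption): a diffusion model would produce an image.

    Here we only return a fixed placeholder path.
    """
    return "background.png"


def render_html(bp: Blueprint, background_url: str,
                width: int = 1024, height: int = 1536) -> str:
    """Stage 3 sketch: absolutely positioned text over the background image."""
    divs = "\n".join(
        f'  <div style="position:absolute;left:{b.left_pct}%;top:{b.top_pct}%;'
        f'font-size:{b.font_px}px">{escape(b.text)}</div>'
        for b in bp.blocks
    )
    return (
        f'<div style="position:relative;width:{width}px;height:{height}px;'
        f"background:url('{escape(background_url)}') center/cover\">\n"
        f"{divs}\n</div>"
    )


def generate_poster(user_request: str) -> str:
    """Chain the three stages and return the poster as an HTML string."""
    bp = make_blueprint(user_request)
    bg = generate_background(bp.background_prompt)
    return render_html(bp, bg)
```

Calling `generate_poster("Summer Sale")` returns an HTML fragment that a browser can display or rasterize. Because each text block is an ordinary element, changing its wording, position, or size afterwards is a markup edit and does not require regenerating the background.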