🤖 AI Summary
This work addresses the limitations of current models in handling core aspects of poster design, such as visual hierarchy, typographic semantics, and compositional control, in both understanding and generation tasks. To this end, we introduce PosterIQ, a design-driven benchmark for poster understanding and generation, featuring fine-grained annotations of compositional structure, typographic hierarchy, and semantic intent, along with defined tasks including layout parsing, text-image alignment, readability assessment, and metaphor-guided generation. We systematically evaluate multimodal large language models and diffusion-based generative models using professional design annotations and controllable prompts. Our experiments reveal significant deficiencies in existing models' perception of visual hierarchy and compositional awareness: commercial models excel at high-level reasoning but are insensitive when used as automatic raters, while generative models render text accurately yet struggle with composition-aware synthesis. PosterIQ establishes a reproducible evaluation framework and diagnostic metrics for intelligent poster design.
📝 Abstract
We present PosterIQ, a design-driven benchmark for poster understanding and generation, annotated across composition structure, typographic hierarchy, and semantic intent. It includes 7,765 image-annotation instances and 822 generation prompts spanning real, professional, and synthetic cases. To bridge visual design cognition and generative modeling, we define tasks for layout parsing, text-image correspondence, typography/readability and font perception, design quality assessment, and controllable, composition-aware generation with metaphor. We evaluate state-of-the-art MLLMs and diffusion-based generators, finding persistent gaps in visual hierarchy, typographic semantics, saliency control, and intention communication; commercial models lead on high-level reasoning but act as insensitive automatic raters, while generators render text well yet struggle with composition-aware synthesis. Extensive analyses show PosterIQ is both a quantitative benchmark and a diagnostic tool for design reasoning, offering reproducible, task-specific metrics. We aim to catalyze models' creativity and integrate human-centred design principles into generative vision-language systems.