PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback

📅 2026-02-12
🤖 AI Summary
This work addresses the dual challenge of semantic fidelity and aesthetic coherence in image-to-poster generation, which requires balancing localized editing with holistic composition. It proposes PosterOmni, a unified framework that both preserves local visual entities and drives global, concept-level design. The approach covers multiple tasks, including layout-driven, style-driven, and image-prompted generation, through multi-scenario data construction, knowledge distillation between local and global experts, and a unified reward alignment mechanism. The authors also introduce PosterOmni-Bench, a benchmark covering both local editing and global creation. Experiments show that the method outperforms all open-source baselines and surpasses several proprietary systems in reference adherence, compositional quality, and aesthetic harmony.

📝 Abstract
Image-to-poster generation is a high-demand task requiring not only local adjustments but also high-level design understanding. Models must generate text, layout, style, and visual elements while preserving semantic fidelity and aesthetic coherence. The process spans two regimes: local editing, where ID-driven generation, rescaling, filling, and extending must preserve concrete visual entities; and global creation, where layout- and style-driven tasks rely on understanding abstract design concepts. These intertwined demands make image-to-poster generation a multi-dimensional process that couples entity-preserving editing with concept-driven creation under image-prompt control. To address these challenges, we propose PosterOmni, a generalized artistic poster creation framework that unlocks the potential of a base edit model for multi-task image-to-poster generation. PosterOmni integrates the two regimes, local editing and global creation, within a single system through an efficient data-distillation-reward pipeline: (i) constructing multi-scenario image-to-poster datasets covering six task types across entity-based and concept-based creation; (ii) distilling knowledge between local and global experts for supervised fine-tuning; and (iii) applying unified PosterOmni Reward Feedback to jointly align visual entity preservation and aesthetic preference across all tasks. Additionally, we establish PosterOmni-Bench, a unified benchmark for evaluating both local editing and global creation. Extensive experiments show that PosterOmni significantly enhances reference adherence, global composition quality, and aesthetic harmony, outperforming all open-source baselines and even surpassing several proprietary systems.
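
As a rough illustration of how a single reward could span both regimes, the sketch below groups the six task types named in the abstract and blends an entity-preservation score with an aesthetic score. It is not the paper's implementation: the function names, the linear blend, and the weight values are all assumptions made for this example, and the abstract does not specify how PosterOmni Reward Feedback is actually computed.

```python
# Illustrative sketch only; not the PosterOmni implementation.
# Task names follow the abstract; everything else is hypothetical.

LOCAL_EDITING = {"id_driven", "rescaling", "filling", "extending"}
GLOBAL_CREATION = {"layout_driven", "style_driven"}

# Hypothetical weight on entity preservation per regime (assumed values).
ENTITY_WEIGHT = {"local": 0.7, "global": 0.3}


def regime_of(task: str) -> str:
    """Map one of the six task types to its regime."""
    if task in LOCAL_EDITING:
        return "local"
    if task in GLOBAL_CREATION:
        return "global"
    raise ValueError(f"unknown task type: {task}")


def unified_reward(task: str, entity_score: float, aesthetic_score: float) -> float:
    """Blend two scores in [0, 1] using a regime-dependent weight (assumed form)."""
    w = ENTITY_WEIGHT[regime_of(task)]
    return w * entity_score + (1.0 - w) * aesthetic_score


if __name__ == "__main__":
    # Same scores, different regimes: local tasks lean on entity preservation.
    print(unified_reward("filling", entity_score=0.9, aesthetic_score=0.5))
    print(unified_reward("style_driven", entity_score=0.9, aesthetic_score=0.5))
```

The point of the sketch is only that one scoring function can serve all six tasks while still weighting entity preservation more heavily for local editing than for global creation; a learned reward model could replace the fixed weights.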
Problem

Research questions and friction points this paper is trying to address.

image-to-poster generation
local editing
global creation
semantic fidelity
aesthetic coherence
Innovation

Methods, ideas, or system contributions that make the work stand out.

task distillation
unified reward feedback
image-to-poster generation
multi-task learning
aesthetic coherence