AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models

📅 2025-03-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing generative models treat graphical layout design as a pure prediction task, neglecting aesthetic consistency of rendered outputs and thus deviating significantly from human aesthetic preferences. To address this, we propose the Aesthetic-Aware Preference Alignment (AAPA) framework—the first to integrate Direct Preference Optimization (DPO) into layout generation—enabling end-to-end, aesthetics-driven modeling via multimodal large language models (MLLMs). Our key contributions are: (1) a layout aesthetic preference alignment training paradigm; (2) a layout quality–guided data filtering protocol; and (3) a novel MLLM-based aesthetic win-rate metric for preference evaluation. AAPA generalizes across parameter scales (1B–8B) and architectures (Qwen, Phi, InternLM), achieving 17% and 16% improvements on the Crello and WebUI benchmarks, respectively—surpassing state-of-the-art methods. Extensive quantitative and qualitative analyses validate its effectiveness.

📝 Abstract
Visual layouts are essential in graphic design fields such as advertising, posters, and web interfaces. The application of generative models for content-aware layout generation has recently gained traction. However, these models fail to understand the contextual aesthetic requirements of layout design and do not align with human-like preferences, primarily treating layout generation as a prediction task without considering the final rendered output. To overcome these problems, we propose Aesthetic-Aware Preference Alignment (AAPA), a novel technique for training a Multi-modal Large Language Model (MLLM) for layout prediction that uses MLLM aesthetic preferences for Direct Preference Optimization over graphic layouts. We propose a data filtering protocol that uses our layout-quality heuristics to ensure that AAPA training happens on high-quality layouts. Additionally, we introduce a novel evaluation metric that uses another MLLM to compute the win rate of the generated layout against the ground-truth layout based on aesthetic criteria. We also demonstrate the applicability of AAPA to MLLMs of varying scales (1B to 8B parameters) and LLM families (Qwen, Phi, InternLM). Through thorough qualitative and quantitative analyses, we verify the efficacy of our approach on two challenging benchmarks, Crello and WebUI, showing improvements of 17% and 16%, respectively, over current state-of-the-art methods, thereby highlighting the potential of MLLMs in aesthetic-aware layout generation.
Problem

Research questions and friction points this paper is trying to address.

Enhance graphic layout design using aesthetic-aware multi-modal models.
Align layout generation with human-like aesthetic preferences.
Improve layout quality via direct preference optimization techniques.
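
For readers unfamiliar with Direct Preference Optimization, the sketch below shows the standard DPO loss for a single preference pair, applied here to a preferred and a rejected layout. It is a minimal illustration under stated assumptions, not the paper's implementation: the function name `dpo_loss`, its parameters, and the default `beta=0.1` are hypothetical, and the inputs are assumed to be summed token log-probabilities of each serialized layout under the trained model and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) layout pair.

    All arguments are log-probabilities of a serialized layout; the names
    and the beta default are illustrative assumptions, not from the paper.
    """
    # How much more the policy favors each layout than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# The loss falls below log(2) once the policy prefers the chosen layout
# more strongly than the reference model does.
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In practice the log-probabilities would come from the MLLM's forward pass over the layout tokens, and the loss would be averaged over a batch of preference pairs.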
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aesthetic-Aware Preference Alignment for MLLMs
Data filtering with layout-quality heuristics
Novel MLLM-based evaluation metric for aesthetics
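
The win-rate metric can be illustrated with a small sketch. The abstract only states that a separate MLLM compares each generated layout against the ground truth on aesthetic criteria; everything else here is an assumption, including the function name `aesthetic_win_rate`, the verdict labels, and the choice to count ties as half a win. The judge model itself is not shown: its verdicts are taken as given.

```python
def aesthetic_win_rate(judgments, count_ties_as_half=True):
    """Fraction of comparisons in which the generated layout was preferred.

    `judgments` is a list of verdicts from a (hypothetical) MLLM judge, each
    one of "generated", "ground_truth", or "tie". The label names and the tie
    handling are illustrative assumptions, not taken from the paper.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    wins = sum(1 for verdict in judgments if verdict == "generated")
    ties = sum(1 for verdict in judgments if verdict == "tie")
    score = wins + (0.5 * ties if count_ties_as_half else 0.0)
    return score / len(judgments)


# Two wins, one loss, and one tie across four comparisons.
verdicts = ["generated", "ground_truth", "tie", "generated"]
rate = aesthetic_win_rate(verdicts)
```

A win rate of 0.5 would mean the judge finds generated layouts as appealing as the ground truth on average; whether the paper's metric treats ties this way is not stated in the abstract.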