SerialGen: Personalized Image Generation by First Standardization Then Personalization

📅 2024-12-02

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 1

🤖 AI Summary

This work addresses the challenge of jointly achieving *text controllability* and *full-body appearance consistency* in personalized portrait generation. We propose a two-stage diffusion-based serial framework: Stage I employs a reference-image standardization module to decouple pose and shape while normalizing appearance; Stage II introduces an appearance alignment mechanism and text-condition fusion to enable high-fidelity, text-driven personalized generation. We pioneer the “standardization → personalization” paradigm, designing two synergistic modules to enhance standardization accuracy and enabling consistent multi-prompt sequence generation. Experiments demonstrate that our method significantly outperforms state-of-the-art approaches in full-body appearance fidelity and cross-prompt consistency, achieving leading performance across multiple benchmarks.

📝 Abstract

In this work, we are interested in achieving both high text controllability and whole-body appearance consistency in the generation of personalized human characters. We propose a novel framework, named SerialGen, which is a serial generation method consisting of two stages: first, a standardization stage that standardizes reference images, and then a personalized generation stage based on the standardized reference. Furthermore, we introduce two modules aimed at enhancing the standardization process. Our experimental results validate the proposed framework's ability to produce personalized images that faithfully recover the reference image's whole-body appearance while accurately responding to a wide range of text prompts. Through thorough analysis, we highlight the critical contribution of the proposed serial generation method and standardization model, evidencing enhancements in appearance consistency between reference and output images and across serial outputs generated from diverse text prompts. The term "Serial" in this work carries a double meaning: it refers to the two-stage method and also underlines our ability to generate serial images with consistent appearance throughout.
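The two-stage flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: `SerialPipeline`, `standardize`, `personalize`, and the toy functions are hypothetical names, images are represented by plain strings, and reusing one standardized reference across all prompts is an assumption about how the "serial" outputs stay consistent.

```python
from dataclasses import dataclass
from typing import Callable, List

# Stand-in type for illustration; a real pipeline would pass tensors or image objects.
Image = str


@dataclass
class SerialPipeline:
    """Hypothetical wrapper: standardize the reference first, then personalize."""

    standardize: Callable[[Image], Image]        # stage 1 (assumed interface)
    personalize: Callable[[Image, str], Image]   # stage 2 (assumed interface)

    def generate(self, reference: Image, prompts: List[str]) -> List[Image]:
        # Stage 1 runs once; its output is shared by every prompt (an assumption).
        standardized = self.standardize(reference)
        # Stage 2 runs once per text prompt on the same standardized reference.
        return [self.personalize(standardized, prompt) for prompt in prompts]


def toy_standardize(reference: Image) -> Image:
    # Placeholder for a standardization model.
    return f"standardized({reference})"


def toy_personalize(standardized: Image, prompt: str) -> Image:
    # Placeholder for a text-conditioned personalized generator.
    return f"{standardized} + '{prompt}'"
```

Calling `SerialPipeline(toy_standardize, toy_personalize).generate("ref.png", ["on a beach", "in a suit"])` returns one output per prompt, each built from the same standardized reference.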

Problem

Research questions and friction points this paper is trying to address.

Achieving high text controllability in personalized human character generation

Ensuring whole-body appearance consistency in generated images

Standardizing reference images before personalized generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage serial generation method

Standardization stage that normalizes reference images, strengthened by two dedicated modules

Personalized generation maintains appearance consistency
