DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability

📅 2025-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text-to-image personalization methods, although they achieve high identity fidelity, struggle with limited multi-ID usability, inadequate facial editability, and reliance on subject-specific tuning. To address these limitations, the authors propose DynamicID, a tuning-free, zero-shot framework for single- and multi-ID image personalization. First, a Semantic-Activated Attention (SAA) mechanism with query-level activation gating injects ID features while minimizing disruption to the base model, and enables multi-ID personalization without requiring multi-ID training samples. Second, an Identity-Motion Reconfigurator (IMR) uses contrastive learning to disentangle and re-entangle identity and facial-motion features, enabling flexible facial editing. Third, a dual-stage training paradigm ties these components together, and the authors release VariFace-10k, a curated dataset of 10k unique identities with 35 distinct facial images each. Experiments demonstrate state-of-the-art performance in identity fidelity, facial editability, and multi-ID personalization.

📝 Abstract
Recent advancements in text-to-image generation have spurred interest in personalized human image generation, which aims to create novel images featuring specific human identities as reference images indicate. Although existing methods achieve high-fidelity identity preservation, they often struggle with limited multi-ID usability and inadequate facial editability. We present DynamicID, a tuning-free framework supported by a dual-stage training paradigm that inherently facilitates both single-ID and multi-ID personalized generation with high fidelity and flexible facial editability. Our key innovations include: 1) Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the original model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training. 2) Identity-Motion Reconfigurator (IMR), which leverages contrastive learning to effectively disentangle and re-entangle facial motion and identity features, thereby enabling flexible facial editing. Additionally, we have developed a curated VariFace-10k facial dataset, comprising 10k unique individuals, each represented by 35 distinct facial images. Experimental results demonstrate that DynamicID outperforms state-of-the-art methods in identity fidelity, facial editability, and multi-ID personalization capability.
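The abstract states that the IMR disentangles identity from facial motion via contrastive learning. The paper's exact objective is not given here; a common choice for this kind of training signal is an InfoNCE loss that pulls together identity embeddings of the same person under different facial motions and pushes apart embeddings of different people. The sketch below shows that generic loss only (the function name and shapes are illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE contrastive loss (illustrative, not the paper's
    exact objective).

    anchor:    (B, D)    identity embeddings of a face image
    positive:  (B, D)    identity embeddings of the SAME person, different motion
    negatives: (B, K, D) identity embeddings of DIFFERENT people
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    # Similarity of each anchor to its positive: (B, 1)
    pos = (anchor * positive).sum(dim=-1, keepdim=True) / temperature
    # Similarity of each anchor to its K negatives: (B, K)
    neg = torch.einsum("bd,bkd->bk", anchor, negatives) / temperature
    logits = torch.cat([pos, neg], dim=1)
    # The positive always sits at index 0 of the logits.
    labels = torch.zeros(anchor.shape[0], dtype=torch.long)
    return F.cross_entropy(logits, labels)
```

Minimizing such a loss encourages the identity branch to be invariant to facial motion, which is the disentanglement property the IMR relies on for re-entangling identity with new motion at edit time.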
Problem

Research questions and friction points this paper is trying to address.

Limited multi-ID usability in existing personalized human image generation methods.
Inadequate facial editability, even when identity fidelity is high.
Reliance on per-subject fine-tuning, which a tuning-free framework should remove for both single- and multi-ID generation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-Activated Attention for multi-ID personalization
Identity-Motion Reconfigurator for flexible facial editing
Dual-stage training paradigm for high-fidelity generation
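The SAA bullet above describes query-level activation gating: a gate computed per query token controls how strongly ID features are injected at each spatial location, so regions unrelated to a face are left largely untouched. The module below is a minimal sketch of that idea assuming a standard cross-attention layout; the class name, dimensions, and gate design are assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn

class SemanticActivatedAttentionSketch(nn.Module):
    """Hypothetical sketch of query-level activation gating in
    cross-attention (illustrative; not the paper's exact module)."""

    def __init__(self, dim, id_dim):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(id_dim, dim, bias=False)
        self.to_v = nn.Linear(id_dim, dim, bias=False)
        # One scalar gate per query token: decides whether ID
        # features are injected at that spatial location.
        self.gate = nn.Linear(dim, 1)

    def forward(self, hidden, id_tokens):
        # hidden: (B, N, dim) U-Net features; id_tokens: (B, M, id_dim)
        q = self.to_q(hidden)
        k = self.to_k(id_tokens)
        v = self.to_v(id_tokens)
        scale = q.shape[-1] ** -0.5
        attn = torch.softmax(q @ k.transpose(-1, -2) * scale, dim=-1)
        id_out = attn @ v                       # (B, N, dim) ID injection
        g = torch.sigmoid(self.gate(hidden))    # (B, N, 1) per-query gate
        # Residual update: queries with g near 0 keep the original
        # features, minimizing disruption to the base model.
        return hidden + g * id_out
```

With separate ID token sets and gates active in disjoint query regions, the same mechanism would let multiple identities be injected into one image without cross-talk, which is consistent with the multi-ID claim above.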
Authors
Xirui Hu — Xi'an Jiaotong University (diffusion models, AIGC)
Jiahao Wang — School of Computer Science and Technology, Xi'an Jiaotong University
Hao Chen — AI Lab, Western Movie Group
Weizhan Zhang — Professor, Department of Computer Science and Technology, Xi'an Jiaotong University (multimedia networking)
Benqi Wang — AI Lab, Western Movie Group
Yikun Li — Postdoctoral Researcher (artificial intelligence, software engineering, cyber security)
Haishun Nan — AI Lab, Western Movie Group