Gen-AFFECT: Generation of Avatar Fine-grained Facial Expressions with Consistent identiTy

📅 2025-08-12
🤖 AI Summary
Existing 2D avatar generation methods struggle to capture fine-grained facial expressions while keeping identity consistent across expressions. GEN-AFFECT is a personalized avatar generation framework built on a multimodal diffusion transformer. Its two main components are: (1) an extracted identity-expression representation that conditions the transformer, supporting identity preservation across a wide range of facial expressions; and (2) consistent attention applied at inference, which shares information across the set of generated expressions so that identity stays stable over the whole array. The authors report that GEN-AFFECT outperforms previous state-of-the-art methods in expression accuracy, identity preservation, and cross-expression identity consistency.

📝 Abstract
Customized 2D avatars are widely used in gaming, virtual communication, education, and content creation. However, existing approaches often fail to capture fine-grained facial expressions and struggle to preserve identity across different expressions. We propose GEN-AFFECT, a novel framework for personalized avatar generation that produces expressive, identity-consistent avatars with a diverse set of facial expressions. Our framework conditions a multimodal diffusion transformer on an extracted identity-expression representation, which enables identity preservation and the representation of a wide range of facial expressions. GEN-AFFECT additionally employs consistent attention at inference to share information across the set of generated expressions, allowing the generation process to maintain identity consistency over the array of generated fine-grained expressions. GEN-AFFECT demonstrates superior performance compared to previous state-of-the-art methods in terms of the accuracy of the generated expressions, identity preservation, and the consistency of the target identity across an array of fine-grained facial expressions.

Problem

Research questions and friction points this paper is trying to address.

Generating fine-grained facial expressions for avatars
Preserving identity consistency across diverse expressions
Improving avatar personalization in virtual applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal diffusion transformer for avatar generation
Identity-expression representation for consistency
Consistent attention mechanism during inference
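
The abstract says that consistent attention shares information across the set of generated expressions at inference, but it does not spell out the mechanism. The sketch below illustrates one common way such sharing can be done, and it is an assumption rather than the authors' implementation: each image's queries attend to keys and values pooled from every image in the set, instead of only its own. The function names (consistent_attention, independent_attention), the tensor shapes, and the single-head NumPy setup are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def independent_attention(q, k, v):
    """Baseline: each image attends only to its own tokens.

    q, k, v have shape (n_images, n_tokens, dim).
    """
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (n, t, t)
    return softmax(scores, axis=-1) @ v             # (n, t, d)


def consistent_attention(q, k, v):
    """Assumed sharing scheme: every image attends to the tokens of all images.

    Keys and values are pooled over the whole set of generated expressions,
    so identity cues present in one image are visible to all the others.
    """
    n, t, d = k.shape
    k_all = k.reshape(n * t, d)                     # pooled keys   (n * t, d)
    v_all = v.reshape(n * t, d)                     # pooled values (n * t, d)
    scores = q @ k_all.T / np.sqrt(d)               # (n, t, n * t)
    return softmax(scores, axis=-1) @ v_all         # (n, t, d)


# Toy example: 4 expression images, 16 tokens each, 8-dimensional features.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 16, 8))
k = rng.normal(size=(4, 16, 8))
v = rng.normal(size=(4, 16, 8))
out = consistent_attention(q, k, v)
```

With a single image in the set, the pooled variant reduces to ordinary self-attention, so the sharing only has an effect when several expressions are generated together. The cost is that the attention matrix grows with the number of images in the set.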