🤖 AI Summary
Existing methods for 4D facial expression generation exhibit limited robustness in cross-identity generalization. To address this issue, this work proposes LM-4DGAN, a novel framework that, for the first time, leverages neutral-face landmarks as a guiding signal for expression synthesis. The model integrates an identity discriminator and a landmark autoencoder to enhance identity invariance, while incorporating a cross-attention mechanism within the displacement decoder to enable personalized adaptation to specific identities. Built upon a Wasserstein GAN (WGAN) architecture, LM-4DGAN significantly improves both the robustness and expressiveness of cross-identity facial animation, outperforming current approaches that rely on label-based or speech-driven conditioning.
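The summary states that LM-4DGAN is built on a Wasserstein GAN. As a point of reference, the core WGAN objective (before any of the paper's identity-specific additions, whose exact form is not given here) can be sketched as below; the function and variable names are illustrative, not taken from the paper:

```python
import numpy as np

def wasserstein_losses(critic_real, critic_fake):
    """Standard WGAN objectives: the critic maximizes the score gap
    between real and generated samples, while the generator maximizes
    the critic's score on generated samples.

    Note: LM-4DGAN also adds an identity discriminator and a landmark
    autoencoder on top of this base objective; their loss terms are not
    specified in this summary, so they are omitted here."""
    critic_loss = critic_fake.mean() - critic_real.mean()
    gen_loss = -critic_fake.mean()
    return critic_loss, gen_loss

# Toy critic scores for a quick sanity check
real_scores = np.array([0.9, 1.1, 1.0])
fake_scores = np.array([0.1, -0.2, 0.0])
c_loss, g_loss = wasserstein_losses(real_scores, fake_scores)
print(round(c_loss, 4), round(g_loss, 4))
```

A well-trained critic drives `critic_loss` down (large real/fake gap), and the negated gap is commonly tracked as an estimate of the Wasserstein distance between the real and generated distributions.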
📝 Abstract
In this paper, we propose a generative model that learns to synthesize 4D facial expressions from neutral-face landmarks. Existing works mainly focus on generating sequences guided by expression labels, speech, and similar signals, but they are not robust to changes in identity. Our LM-4DGAN uses neutral landmarks to guide facial expression generation, and adds an identity discriminator and a landmark autoencoder to a basic WGAN to achieve better identity robustness. Furthermore, we add a cross-attention mechanism to the displacement decoder so that the generated displacements adapt to the given identity.
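The cross-attention step in the displacement decoder, where expression features attend over an embedding of the neutral landmarks, could look like the following minimal sketch. All dimensions, weight matrices, and names here are hypothetical; the paper's actual decoder architecture is not specified in this abstract:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(expr_feats, landmark_feats, Wq, Wk, Wv):
    """Single-head cross-attention: per-frame expression features (queries)
    attend over neutral-landmark features (keys/values), injecting
    identity information into the decoded displacements."""
    Q = expr_feats @ Wq       # (T, d): one query per frame
    K = landmark_feats @ Wk   # (L, d): one key per landmark
    V = landmark_feats @ Wv   # (L, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (T, L) scaled dot products
    return softmax(scores, axis=-1) @ V       # (T, d) identity-conditioned features

# Toy sizes: T frames, L landmarks (68 is a common 2D/3D landmark count), dim d
rng = np.random.default_rng(0)
T, L, d = 8, 68, 16
out = cross_attention(rng.normal(size=(T, d)),
                      rng.normal(size=(L, d)),
                      rng.normal(size=(d, d)),
                      rng.normal(size=(d, d)),
                      rng.normal(size=(d, d)))
print(out.shape)
```

The output would then be consumed by the rest of the displacement decoder, so each generated frame's vertex displacements are conditioned on the target identity's neutral landmarks.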