🤖 AI Summary
Existing unsupervised keypoint-based methods struggle to disentangle identity semantics from motion information such as pose and expression, which limits the controllability of facial animation. This work proposes a self-supervised representation learning framework that explicitly decouples identity, rigid motion, and expression in the latent space, and introduces a novel keypoint representation to enable arbitrary motion control. By incorporating a variational autoencoder that maps expressions to a continuous Gaussian distribution, the method achieves, for the first time in an unsupervised setting, smooth and continuous expression interpolation. Experiments on public benchmarks show that the proposed approach offers pronounced advantages over existing techniques, generating facial animations that are both realistic and finely controllable.
📝 Abstract
Face animation, the task of controlling and generating facial features, has a wide range of applications. Methods based on unsupervised keypoint positioning can produce realistic and detailed virtual portraits. However, they cannot achieve controllable face generation because existing keypoint decomposition pipelines fail to fully decouple identity semantics from intertwined motion information (e.g., rotation, translation, and expression). To address this issue, we present a new method, Motion Manipulation via unsupervised keypoint positioning in Face Animation (MMFA). First, we introduce self-supervised representation learning to encode and decode expressions in the latent feature space and decouple them from other motion information. Second, we propose a new way to compute keypoints that aims to achieve arbitrary motion control. Third, we design a variational autoencoder that maps expression features to a continuous Gaussian distribution, which allows us, for the first time, to interpolate facial expressions in an unsupervised framework. Extensive experiments on publicly available datasets validate the effectiveness of MMFA and show that it offers pronounced advantages over prior art in creating realistic animation and manipulating face motion.
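
To illustrate why mapping expression features to a continuous Gaussian distribution makes interpolation possible, here is a minimal sketch of the three generic building blocks of a variational autoencoder latent space: reparameterized sampling, the KL term that pulls the posterior toward a standard normal, and a linear path between two latent codes. This is not the MMFA implementation. The function names, the plain-list latent vectors, and the choice of linear interpolation are assumptions made for illustration, and the encoder and decoder networks are omitted because the abstract does not specify them.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, 1) drawn per dimension.

    mu and log_var stand in for the outputs of a (hypothetical) expression encoder.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def interpolate(z_a, z_b, steps):
    """Return `steps` latent codes on the straight line from z_a to z_b (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    # Two made-up expression codes, e.g. "neutral" and "smile" (illustrative values only).
    z_neutral = reparameterize([0.0, 0.0], [-2.0, -2.0], rng)
    z_smile = reparameterize([1.0, 2.0], [-2.0, -2.0], rng)
    for z in interpolate(z_neutral, z_smile, steps=5):
        print([round(v, 3) for v in z])
```

Because the KL term encourages codes to fill the latent space densely around the origin, intermediate points on such a path tend to decode to plausible expressions. In a full system, each interpolated code would be passed to an expression decoder and then to the keypoint computation.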