🤖 AI Summary
Problem: Virtual face synthesis struggles to preserve identity consistency while also achieving stylistic diversity. Method: This paper proposes MorphFace, a 3D Morphable Model (3DMM)-guided diffusion framework. It is the first to explicitly model subject-specific style attributes (shape, pose, and expression), and it introduces a statistics-driven style sampling mechanism that captures both intra-subject variation and inter-subject differences. It also uses a context blending strategy that combines 3DMM-rendered geometric priors with identity features extracted from a pre-trained face recognition model, which strengthens conditional control. Results: Experiments show that training on images synthesized by this method improves downstream face recognition performance across multiple benchmarks and surpasses prior state-of-the-art synthesis approaches. These results support the effectiveness and generalizability of identity-faithful, style-controllable synthetic data for real-world face recognition tasks.
📝 Abstract
Identity-preserving face synthesis aims to generate synthetic face images of virtual subjects that can substitute for real-world data in training face recognition models. Prior work strives to create images with consistent identities and diverse styles, but faces a trade-off between the two. This paper attributes the trade-off to treating style variation as subject-agnostic, whereas real-world persons have distinct, subject-specific styles. Building on this observation, it introduces MorphFace, a diffusion-based face generator. The generator learns fine-grained facial styles, e.g., shape, pose, and expression, from the renderings of a 3D morphable model (3DMM), and learns identities from an off-the-shelf recognition model. To create virtual faces, the generator is conditioned on novel identities drawn from unlabeled synthetic faces and on novel styles statistically sampled from a real-world prior distribution. The sampling explicitly accounts for both intra-subject variation and subject distinctiveness. A context blending strategy is employed to enhance the generator's responsiveness to identity and style conditions. Extensive experiments show that MorphFace outperforms the best prior methods in face recognition efficacy.
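The idea of sampling styles that reflect both subject distinctiveness and intra-subject variation can be illustrated with a minimal two-level sampler. This is only a sketch under stated assumptions, not the sampling procedure used by MorphFace: it assumes the style prior is a diagonal Gaussian over a small vector of 3DMM-like coefficients, and the function name `sample_subject_styles`, the `intra_scale` parameter, and the example prior values are all hypothetical.

```python
import numpy as np


def sample_subject_styles(prior_mean, prior_std, n_subjects, n_images,
                          intra_scale=0.5, seed=0):
    """Two-level style sampling (illustrative assumption, not the paper's method).

    prior_mean, prior_std: per-dimension statistics of a style prior,
        assumed here to be a diagonal Gaussian.
    intra_scale: hypothetical factor that shrinks per-image variation
        relative to the spread between subjects.
    """
    rng = np.random.default_rng(seed)
    prior_mean = np.asarray(prior_mean, dtype=float)
    prior_std = np.asarray(prior_std, dtype=float)
    dim = prior_mean.shape[0]

    # Subject distinctiveness: each virtual subject gets its own style centre.
    subject_means = prior_mean + prior_std * rng.standard_normal((n_subjects, dim))

    # Intra-subject variation: each image of a subject varies around that centre.
    noise = rng.standard_normal((n_subjects, n_images, dim))
    styles = subject_means[:, None, :] + intra_scale * prior_std * noise
    return subject_means, styles


# Usage: 4 virtual subjects, 8 images each, 3 made-up style dimensions.
means, styles = sample_subject_styles(
    prior_mean=[0.0, 0.0, 0.0],
    prior_std=[1.0, 0.5, 0.2],
    n_subjects=4,
    n_images=8,
)
print(styles.shape)  # (4, 8, 3)
```

In this sketch, images of the same subject cluster around a shared style centre, while different subjects have different centres. A subject-agnostic sampler would instead draw every image's style from the same distribution regardless of subject.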