🤖 AI Summary
This paper addresses cross-identity facial pose and expression transfer. We propose a self-supervised method operating in the StyleGAN2 latent space, employing a dual-encoder–mapping architecture: a source-image encoder extracts pose and expression representations, while a target-image encoder captures identity features; these are fused by a latent-space mapping network that drives the StyleGAN2 generator to reconstruct the target identity under the source's pose and expression. Training is fully self-supervised, leveraging only inter-frame consistency from unlabeled video sequences; no manual labeling is required. Our key contributions are: (1) the first disentanglement-aware dual-path latent mapping mechanism for controllable editing; (2) fine-grained pose and expression transfer across arbitrary identities with explicit control; and (3) high-fidelity synthesis at near-real-time inference speed. Extensive experiments demonstrate superior qualitative and quantitative performance over existing unsupervised approaches.
📄 Abstract
We propose a method to transfer pose and expression between face images. Given a source and a target face portrait, the model produces an output image in which the pose and expression of the source are transferred onto the target identity. The architecture consists of two encoders and a mapping network that projects the two inputs into the latent space of StyleGAN2, which then generates the output. Training is self-supervised from video sequences of many individuals; manual labeling is not required. Our model enables the synthesis of random identities with controllable pose and expression, and achieves close-to-real-time performance.
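The data flow described above (two encoders feeding a mapping network that emits a StyleGAN2-style latent) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the dimensions, the function names (`encode_pose`, `encode_identity`, `map_to_w`), and the random linear "networks" are all hypothetical stand-ins for the trained modules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: image features flattened to 1024-d,
# pose/expression code 64-d, identity code 448-d, and a 512-d
# w latent matching StyleGAN2's latent width.
D_IMG, D_POSE, D_ID, D_W = 1024, 64, 448, 512

# Stand-in "encoders" and "mapping network" as random linear maps;
# in the paper these are trained networks, here only the shapes matter.
W_pose = rng.standard_normal((D_POSE, D_IMG)) * 0.01
W_id = rng.standard_normal((D_ID, D_IMG)) * 0.01
W_map = rng.standard_normal((D_W, D_POSE + D_ID)) * 0.01

def encode_pose(src_feats):
    """Source encoder: pose/expression representation of the source image."""
    return np.tanh(W_pose @ src_feats)

def encode_identity(tgt_feats):
    """Target encoder: identity representation of the target image."""
    return np.tanh(W_id @ tgt_feats)

def map_to_w(pose_code, id_code):
    """Mapping network: fuse both codes into one StyleGAN2-style w latent."""
    return W_map @ np.concatenate([pose_code, id_code])

# Stand-in feature vectors for a source and a target portrait.
src = rng.standard_normal(D_IMG)
tgt = rng.standard_normal(D_IMG)

w = map_to_w(encode_pose(src), encode_identity(tgt))
print(w.shape)  # (512,)
```

The resulting `w` would be fed to a (pretrained, frozen) StyleGAN2 generator; swapping `src` while keeping `tgt` fixed changes only the pose/expression code, which is what makes the editing controllable.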