🤖 AI Summary
Existing face-swapping methods struggle to simultaneously achieve high fidelity, real-time performance, and pose robustness under extreme head poses. This work proposes a novel approach that, for the first time, integrates vision–language semantic contrastive learning into face swapping by leveraging CLIP’s image and text embeddings to construct a semantic contrastive loss. This strategy enhances identity representation and attribute preservation without requiring explicit geometric modeling. Combined with a lightweight generative architecture, the method significantly outperforms state-of-the-art techniques on the FF++, MPIE, and LPFF benchmarks, particularly excelling in large-pose scenarios while supporting real-time inference.
📝 Abstract
Existing face-swapping methods often deliver competitive results in constrained settings but degrade substantially under extreme facial poses. Prior work improves pose robustness by applying explicit geometric features, but this approach remains problematic: it introduces additional dependencies and increases computational cost. Diffusion-based methods achieve remarkable quality; however, they are impractical for real-time processing. We introduce AlphaFace, which leverages CLIP, an open-source vision-language model, using its image and text embeddings to construct novel visual and textual semantic contrastive losses. These losses enable stronger identity representation and more precise attribute preservation, all while maintaining real-time performance. Comprehensive experiments on FF++, MPIE, and LPFF demonstrate that AlphaFace surpasses state-of-the-art methods in pose-challenging cases. The project is publicly available at `https://github.com/andrewyu90/Alphaface_Official.git`.
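The abstract does not give the exact form of the semantic contrastive losses. As a rough illustration only, a common way to build such a loss over CLIP-style embeddings is an InfoNCE objective: the generated face's embedding is pulled toward matching (e.g. source-identity) embeddings and pushed away from non-matching ones. The function below is a minimal NumPy sketch under that assumption; the names, the cosine-similarity choice, and the temperature value are all illustrative, not taken from the paper.

```python
import numpy as np

def semantic_contrastive_loss(anchor, positives, negatives, temperature=0.07):
    """InfoNCE-style contrastive loss over L2-normalized embeddings.

    anchor:    (d,)   embedding of the swapped face (e.g. from CLIP's image encoder)
    positives: (p, d) embeddings the result should match (e.g. source identity)
    negatives: (n, d) embeddings it should not match
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a = normalize(anchor)
    pos = normalize(positives) @ a / temperature   # cosine similarities, shape (p,)
    neg = normalize(negatives) @ a / temperature   # cosine similarities, shape (n,)
    logits = np.concatenate([pos, neg])

    # log-sum-exp over all candidates, shifted by the max for numerical stability
    m = np.max(logits)
    lse = m + np.log(np.sum(np.exp(logits - m)))

    # average negative log-likelihood of each positive against all candidates;
    # since each positive is included in the candidate set, the loss is >= 0
    return float(np.mean(lse - pos))
```

With this formulation, an anchor embedding close to the positives yields a lower loss than one close to the negatives, which is the behavior a semantic contrastive term would rely on during training.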