🤖 AI Summary
Expression retargeting across structurally diverse face meshes remains challenging: the goal is high-fidelity expression cloning and fine-grained, controllable editing without remeshing or pre-alignment. Method: We propose a framework that localizes a global latent code on the target mesh through per-vertex skinning weight prediction, and we introduce a joint learning scheme integrating Facial Action Coding System (FACS)-guided semantic supervision, neural skinning weight regression, and indirect guidance from predefined segmentation labels. Contribution/Results: The method supports input meshes of arbitrary topology and outperforms state-of-the-art methods in expression fidelity, deformation transfer accuracy, and cross-mesh generalization. It enables real-time editing on unseen face geometries while simultaneously supporting holistic expression control and recovery of localized geometric detail.
📝 Abstract
Accurately retargeting facial expressions to a face mesh while enabling manipulation is a key challenge in facial animation retargeting. Recent deep‐learning methods address this by encoding facial expressions into a global latent code, but they often fail to capture fine‐grained details in local regions. While some methods improve local accuracy by transferring deformations locally, this often complicates overall control of the facial expression. To address this, we propose a method that combines the strengths of both global and local deformation models. Our approach enables intuitive control and detailed expression cloning across diverse face meshes, regardless of their underlying structures. The core idea is to localize the influence of the global latent code on the target mesh. Our model learns to predict skinning weights for each vertex of the target face mesh through indirect supervision from predefined segmentation labels. These predicted weights localize the global latent code, enabling precise and region‐specific deformations even for meshes with unseen shapes. We supervise the latent code using Facial Action Coding System (FACS)‐based blendshapes to ensure interpretability and allow straightforward editing of the generated animation. Through extensive experiments, we demonstrate improved performance over state‐of‐the‐art methods in terms of expression fidelity, deformation transfer accuracy, and adaptability across diverse mesh structures.
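The core mechanism the abstract describes, predicted per-vertex skinning weights that localize the effect of a single global latent code, can be sketched roughly as follows. This is a minimal illustration, not the paper's architecture: the tensor shapes, the softmax normalization of the weights, and the random stand-ins for the network-predicted weights and decoded per-region deformation fields are all assumptions for the sake of the example.

```python
import numpy as np

# Hypothetical sizes: V mesh vertices, K facial regions, D-dim latent code.
V, K, D = 6, 3, 4
rng = np.random.default_rng(0)

# Stand-in for the predicted per-vertex skinning weights, softmax-normalized
# so each vertex's K region weights sum to 1 (a common, assumed choice).
logits = rng.normal(size=(V, K))
weights = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Stand-in for K region-wise deformation fields decoded from the global
# latent code z (here just random offsets; in the paper z is supervised
# with FACS-based blendshapes).
z = rng.normal(size=D)                      # global latent code (unused stand-in)
region_fields = rng.normal(size=(K, V, 3))  # per-region 3D offsets per vertex

# The skinning weights blend the region-wise offsets per vertex, so the
# single global code produces localized, region-specific deformations.
vertex_offsets = np.einsum('vk,kvc->vc', weights, region_fields)  # (V, 3)
```

Editing then amounts to modifying the (interpretable, FACS-aligned) components of `z`: because each vertex's deformation is gated by its skinning weights, a change intended for one region leaves weakly weighted vertices largely untouched.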