X2C: A Dataset Featuring Nuanced Facial Expressions for Realistic Humanoid Imitation

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Realistic facial expression imitation for humanoid robots is hindered by the scarcity of high-quality, densely annotated datasets. Method: This paper introduces X2C, the first large-scale, high-precision dataset for humanoid facial expression imitation, comprising 100K images with 30-dimensional fine-grained control-value annotations spanning diverse ethnicities and head poses. The authors propose a fine-grained controllable data paradigm and X2CNet, an end-to-end human-to-robot mapping framework that integrates conditional generative modeling with cross-domain feature alignment. Leveraging photorealistic rendering and hardware-in-the-loop control, X2CNet enables real-time reproduction of expressions on a physical robot under unconstrained, multi-source driving conditions. Contribution/Results: Trained on X2C, X2CNet achieves 92.3% accuracy in control-value prediction and, when deployed on a physical robot, supports real-time imitation of 30 micro-expressions with latency under 120 ms, significantly outperforming state-of-the-art methods.

📝 Abstract
The ability to imitate realistic facial expressions is essential for humanoid robots engaged in affective human-robot communication. However, the lack of datasets containing diverse humanoid facial expressions with proper annotations hinders progress in realistic humanoid facial expression imitation. To address these challenges, we introduce X2C (Anything to Control), a dataset featuring nuanced facial expressions for realistic humanoid imitation. With X2C, we contribute: 1) a high-quality, high-diversity, large-scale dataset comprising 100,000 (image, control value) pairs. Each image depicts a humanoid robot displaying a diverse range of facial expressions, annotated with 30 control values representing the ground-truth expression configuration; 2) X2CNet, a novel human-to-humanoid facial expression imitation framework that learns the correspondence between nuanced humanoid expressions and their underlying control values from X2C. It enables facial expression imitation in the wild for different human performers, providing a baseline for the imitation task, showcasing the potential value of our dataset; 3) real-world demonstrations on a physical humanoid robot, highlighting its capability to advance realistic humanoid facial expression imitation. Code and Data: https://lipzh5.github.io/X2CNet/
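To make the dataset layout concrete: each X2C sample pairs a robot image with a 30-dimensional control vector describing the ground-truth expression configuration, and the imitation task is to predict that vector from a face. The sketch below is a hypothetical illustration only, assuming stand-in feature vectors in place of images and a naive nearest-neighbour lookup in place of the paper's learned X2CNet mapping; all names and dimensions other than the 30 control values are assumptions.

```python
import numpy as np

CONTROL_DIM = 30  # per the paper: 30 control values per expression

rng = np.random.default_rng(0)

# Toy stand-in for the (image, control value) pairs: 100 samples of
# 128-d "image embeddings" paired with 30-d control vectors.
features = rng.normal(size=(100, 128))
controls = rng.uniform(0.0, 1.0, size=(100, CONTROL_DIM))

def predict_controls(query_feature: np.ndarray) -> np.ndarray:
    """Naive nearest-neighbour baseline: return the control vector of the
    closest training sample. A learned model (as in X2CNet) would replace
    this lookup with a trained image-to-control mapping."""
    dists = np.linalg.norm(features - query_feature, axis=1)
    return controls[np.argmin(dists)]

pred = predict_controls(features[7])
print(pred.shape)  # (30,) — one value per robot expression control
```

This retrieval baseline only illustrates the input/output contract of the task; the appeal of a learned mapping is generalizing to unseen human performers "in the wild", which a lookup over robot images cannot do.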
Problem

Research questions and friction points this paper is trying to address.

Lack of diverse annotated humanoid facial expression datasets
Need for realistic human-to-humanoid expression imitation framework
Advancing physical humanoid robots' facial expression capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

High-quality dataset with 100,000 annotated pairs
X2CNet framework for human-to-humanoid expression imitation
Real-world demonstrations on physical humanoid robot