Morpheus: A Neural-driven Animatronic Face with Hybrid Actuation and Diverse Emotion Control

📅 2025-07-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing anthropomorphic facial devices face dual bottlenecks: (1) hardware trade-offs between compactness and expression fidelity—rigid actuators offer high precision but occupy substantial volume, while tendon-driven mechanisms are space-efficient yet challenging to control; and (2) absence of end-to-end systems mapping speech directly to fine-grained emotional expressions. This paper proposes a hybrid rigid–tendon-driven mechatronic facial system: rigid actuators target high-precision regions (e.g., periorbital and oral areas), whereas tendon-driven actuation is deployed in micro-expression zones (e.g., forehead wrinkles, zygomatic musculature) to maximize spatial efficiency. We further introduce a self-modeling neural network that jointly predicts blendshape coefficients and motor control signals directly from raw speech input. Experiments demonstrate real-time generation of nuanced emotional animations—including joy, fear, disgust, and anger—with high temporal and spatial fidelity. Both the hardware design and source code are publicly released.

📝 Abstract
Previous animatronic faces struggle to express emotions effectively due to hardware and software limitations. On the hardware side, earlier approaches either use rigid-driven mechanisms, which provide precise control but are difficult to design within constrained spaces, or tendon-driven mechanisms, which are more space-efficient but challenging to control. In contrast, we propose a hybrid actuation approach that combines the best of both worlds. The eyes and mouth, key areas for emotional expression, are controlled using rigid mechanisms for precise movement, while the nose and cheeks, which convey subtle facial microexpressions, are driven by strings. This design allows us to build a compact yet versatile hardware platform capable of expressing a wide range of emotions. On the algorithmic side, our method introduces a self-modeling network that maps motor actions to facial landmarks, allowing us to automatically establish the relationship between blendshape coefficients for different facial expressions and the corresponding motor control signals through gradient backpropagation. We then train a neural network to map speech input to the corresponding blendshape controls. With our method, we can generate distinct emotional expressions such as happiness, fear, disgust, and anger from any given sentence, each with nuanced, emotion-specific control signals, a feature that has not been demonstrated in earlier systems. We release the hardware design and code at https://github.com/ZZongzheng0918/Morpheus-Hardware and https://github.com/ZZongzheng0918/Morpheus-Software.
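The self-modeling idea in the abstract (fit a differentiable model of how motor commands move facial landmarks, then invert it by gradient backpropagation to find the motor signals that realize a target expression) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper trains a neural network on real landmark data, whereas here the self-model is a toy linear map with made-up dimensions and names.

```python
import numpy as np

# Hedged sketch of the "self-modeling" idea: learn a differentiable model
# from motor commands to facial landmarks, then invert it by gradient
# descent to recover motor signals for a target expression. Everything
# here (linear model, dimensions, names) is illustrative, not the paper's
# actual network or interfaces.

rng = np.random.default_rng(0)
N_MOTORS, N_LANDMARKS = 3, 8

# Stand-in for the learned self-model: landmarks = W @ motors + b.
# (The paper uses a neural network trained on observed motor/landmark pairs.)
W = rng.normal(size=(N_LANDMARKS, N_MOTORS))
b = rng.normal(size=N_LANDMARKS)

def self_model(motors: np.ndarray) -> np.ndarray:
    """Predict landmark positions from motor commands."""
    return W @ motors + b

def solve_motors(target_landmarks: np.ndarray,
                 steps: int = 2000, lr: float = 0.02) -> np.ndarray:
    """Gradient-descend on squared landmark error to invert the self-model,
    mirroring the backpropagation-based inversion the abstract describes."""
    m = np.zeros(N_MOTORS)
    for _ in range(steps):
        residual = self_model(m) - target_landmarks  # prediction error
        grad = 2.0 * W.T @ residual                  # gradient of ||error||^2
        m -= lr * grad
    return m

# Pretend these landmarks were decoded from blendshape coefficients for a
# desired expression; here we synthesize them from known motor values.
true_motors = rng.uniform(-1.0, 1.0, size=N_MOTORS)
target = self_model(true_motors)

recovered = solve_motors(target)
```

Because this toy model is linear with full column rank, the descent recovers the generating motor values; with a neural self-model the same loop would run through automatic differentiation instead of the hand-written gradient.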
Problem

Research questions and friction points this paper is trying to address.

Rigid actuators are precise but hard to fit within the constrained space of a face, while tendon-driven mechanisms are compact but difficult to control
No automatic way to map desired expressions (blendshape coefficients) to the motor control signals that realize them
No end-to-end system generating fine-grained, emotion-specific expressions directly from speech
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid rigid and tendon-driven actuation system
Self-modeling network for motor-facial landmark mapping
Neural network for speech-to-expression control
👥 Authors
Zongzheng Zhang
Institute for AI Industry Research (AIR), Tsinghua University
Jiawen Yang
Institute for AI Industry Research (AIR), Tsinghua University
Ziqiao Peng
Renmin University of China (3D Face Animation, Talking Head Generation)
Meng Yang
MGI Tech, Shenzhen, China
Jianzhu Ma
Tsinghua University (Machine Learning, Computational Biology, Bioinformatics)
Lin Cheng
Beihang University
Huazhe Xu
Tsinghua University (Embodied AI, Reinforcement Learning, Computer Vision, Deep Learning)
Hang Zhao
Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University
Hao Zhao
Institute for AI Industry Research (AIR), Tsinghua University