IMUFace: Real-Time, Low-Power, Continuous 3D Facial Reconstruction Through Earphones

📅 2025-01-04
🤖 AI Summary
Existing facial expression reconstruction methods suffer from limited environmental robustness, privacy concerns, and high power consumption. To address these challenges, this paper proposes a covert 3D facial expression reconstruction system based on ear-worn inertial measurement units (IMUs). The system infers facial muscle activity from subtle ear motions, which removes the need for cameras and preserves privacy and wearing comfort. The key contributions are: (1) the first high-accuracy decoding paradigm mapping ear-worn IMU signals to facial motion; (2) IMUTwinTrans, a lightweight transformer-based model integrating temporal modeling and twin attention mechanisms; and (3) support for 5-minute personalized calibration, 30 Hz on-device real-time reconstruction, and ultra-low power consumption of only 58 mW. Evaluated in a 12-subject user study, the system achieves a mean landmark error of 2.21 mm and successfully drives low-latency 3D facial animation, demonstrating feasibility for embedded deployment.

📝 Abstract
Facial expression reconstruction has significant potential in fields such as human-computer interaction, affective computing, and virtual reality. Recent studies have proposed ear-worn devices for facial expression reconstruction to address the environmental limitations and privacy concerns of traditional camera-based methods. However, these approaches still need improvement in aesthetics and power consumption. This paper introduces IMUFace, a system that uses inertial measurement units (IMUs) embedded in wireless earphones to detect the subtle ear movements caused by facial muscle activity, enabling covert, low-power facial reconstruction. A user study with 12 participants was conducted, and a deep learning model named IMUTwinTrans was proposed. The results show that IMUFace can predict users' facial landmarks with a mean error of 2.21 mm using only five minutes of training data. The predicted landmarks can then be used to reconstruct a three-dimensional facial model. IMUFace operates at a sampling rate of 30 Hz with a relatively low power consumption of 58 mW. These findings demonstrate the real-world applicability of IMUFace and highlight directions for further research to facilitate its practical adoption.
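
As a rough illustration of how a landmark error such as the reported 2.21 mm could be computed, the sketch below averages the Euclidean distance between predicted and reference landmarks over all frames. This is an assumption about the metric, not the authors' evaluation code: the function name `mean_landmark_error`, the 3D coordinates in millimetres, and the toy values are all hypothetical.

```python
import math


def mean_landmark_error(pred_frames, true_frames):
    """Average Euclidean distance between predicted and reference landmarks.

    Each argument is a list of frames; each frame is a list of (x, y, z)
    landmark coordinates. Units are assumed to be millimetres.
    """
    distances = []
    for pred, true in zip(pred_frames, true_frames):
        if len(pred) != len(true):
            raise ValueError("frames must contain the same number of landmarks")
        for p, t in zip(pred, true):
            distances.append(math.dist(p, t))
    if not distances:
        raise ValueError("no landmarks to compare")
    return sum(distances) / len(distances)


# Toy example (made-up values): one frame with two landmarks.
true_frames = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
pred_frames = [[(3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]]
error_mm = mean_landmark_error(pred_frames, true_frames)
```

Here the two landmark distances are 5 mm and 1 mm, so the averaged error is 3 mm. A real evaluation would also need to fix how faces are aligned and scaled before distances are taken, which this page does not specify.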
Problem

Research questions and friction points this paper is trying to address.

Facial Expression Reconstruction
Low Power Consumption
Privacy Protection
Innovation

Methods, ideas, or system contributions that make the work stand out.

IMUFace System
Real-Time Facial Expression Reconstruction
Low Power Consumption
Xianrong Yao
School of Future Technology, South China University of Technology, Guangzhou, Guangdong 510640, China
Chengzhang Yu
School of Future Technology, South China University of Technology, Guangzhou, Guangdong 510640, China
Lingde Hu
School of Future Technology, South China University of Technology, Guangzhou, Guangdong 510640, China
Yincheng Jin
Binghamton University
Ubiquitous Computing, HCI, Machine Learning
Yang Gao
School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
Zhanpeng Jin
Xinshi Endowed Professor, South China University of Technology
Human-centered computing, ubiquitous computing, human-computer interaction, smart health