IMUFace: Real-Time, Low-Power, Continuous 3D Facial Reconstruction Through Earphones

📅 2025-01-04

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Existing facial expression reconstruction methods suffer from limited environmental robustness, privacy concerns, and high power consumption. To address these challenges, this paper proposes a covert 3D facial expression reconstruction system based on ear-worn inertial measurement units (IMUs). The system infers facial muscle activity from subtle ear motions, which removes the need for cameras and preserves privacy and wearing comfort. The key contributions are: (1) the first high-accuracy decoding paradigm mapping ear-worn IMU signals to facial motion; (2) IMUTwinTrans, a lightweight transformer-based model integrating temporal modeling and twin attention mechanisms; and (3) support for 5-minute personalized calibration, 30 Hz on-device real-time reconstruction, and ultra-low power consumption of only 58 mW. Evaluated in a 12-subject user study, the system achieves a mean landmark error of 2.21 mm and successfully drives low-latency 3D facial animation, demonstrating feasibility for embedded deployment.


📝 Abstract

Facial expression reconstruction has significant potential in fields such as human-computer interaction, affective computing, and virtual reality. Recent studies have proposed using ear-worn devices for facial expression reconstruction to address the environmental limitations and privacy concerns associated with traditional camera-based methods. However, these approaches still require improvements in aesthetics and power consumption. This paper introduces a system called IMUFace. It uses inertial measurement units (IMUs) embedded in wireless earphones to detect subtle ear movements caused by facial muscle activity, allowing for covert and low-power facial reconstruction. A user study involving 12 participants was conducted, and a deep learning model named IMUTwinTrans was proposed. The results show that IMUFace can predict users' facial landmarks with an error of 2.21 mm, using only five minutes of training data. The predicted landmarks can be used to reconstruct a three-dimensional facial model. IMUFace operates at a sampling rate of 30 Hz with a relatively low power consumption of 58 mW. These findings demonstrate the real-world applicability of IMUFace and highlight potential directions for further research to facilitate its practical adoption.
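The 2.21 mm figure is a landmark error. As a minimal sketch of how such a number is commonly computed, the snippet below averages the Euclidean distance between predicted and ground-truth 3D landmarks. This page does not give the paper's exact metric definition, so the function name `mean_landmark_error`, the per-landmark averaging, and the toy coordinates are all assumptions made for illustration.

```python
import math


def mean_landmark_error(pred, truth):
    """Mean Euclidean distance between matched 3D landmarks.

    Hypothetical helper: the result is in the same unit as the inputs
    (millimetres here). The paper's own metric may differ in detail.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("need two non-empty landmark lists of equal length")
    # math.dist returns the Euclidean distance between two points.
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


# Toy landmarks in millimetres (illustrative values, not data from the paper).
truth = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 1.0)]

# Per-landmark distances are 5, 0 and 1 mm, so the mean is 2.0 mm.
error_mm = mean_landmark_error(pred, truth)
print(f"mean landmark error: {error_mm:.2f} mm")
```

In a real evaluation the same average would typically be taken over every frame and every participant, with predictions and ground truth expressed in a shared coordinate frame.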

Problem

Research questions and friction points this paper is trying to address.

Facial Expression Reconstruction

Low Power Consumption

Privacy Protection

Innovation

Methods, ideas, or system contributions that make the work stand out.

IMUFace System

Real-Time Facial Expression Reconstruction

Low Power Consumption

