MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices

📅 2024-10-13

🏛️ ACM Symposium on User Interface Software and Technology

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This paper targets full-body pose capture with the low-fidelity IMUs built into consumer devices such as phones, watches, and earbuds, where sensor noise and drift compromise online performance, temporal consistency, and global translation. It introduces MobilePoser, a real-time system that estimates full-body pose and global 3D translation from any available subset of those IMUs. The system runs a multi-stage deep neural network for kinematic pose estimation and then applies a physics-based motion optimizer; the authors report state-of-the-art accuracy while the system remains lightweight. The paper closes with demonstrative applications in health and wellness, gaming, and indoor navigation.

📝 Abstract

There has been a continued trend towards minimizing instrumentation for full-body motion capture, going from specialized rooms and equipment, to arrays of worn sensors and recently sparse inertial pose capture methods. However, as these techniques migrate towards lower-fidelity IMUs on ubiquitous commodity devices, like phones, watches, and earbuds, challenges arise including compromised online performance, temporal consistency, and loss of global translation due to sensor noise and drift. Addressing these challenges, we introduce MobilePoser, a real-time system for full-body pose and global translation estimation using any available subset of IMUs already present in these consumer devices. MobilePoser employs a multi-stage deep neural network for kinematic pose estimation followed by a physics-based motion optimizer, achieving state-of-the-art accuracy while remaining lightweight. We conclude with a series of demonstrative applications to illustrate the unique potential of MobilePoser across a variety of fields, such as health and wellness, gaming, and indoor navigation to name a few.

Problem

Research questions and friction points this paper is trying to address.

Real-time full-body pose estimation from mobile IMUs

Addressing sensor noise and drift in consumer devices

Achieving accurate 3D human translation without specialized equipment
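
To make the drift problem concrete, here is a minimal sketch of why naively double-integrating accelerometer readings cannot recover global translation. It is not the paper's method. The bias value (0.05 m/s^2) and sample rate (100 Hz) are assumed figures chosen only for illustration, and the function names are hypothetical.

```python
def integrate_position(accels, dt):
    """Double-integrate acceleration samples (m/s^2) into positions (m)."""
    velocity = 0.0
    position = 0.0
    positions = []
    for a in accels:
        velocity += a * dt          # first integration: acceleration -> velocity
        position += velocity * dt   # second integration: velocity -> position
        positions.append(position)
    return positions


def drift_from_bias(bias, duration_s, rate_hz):
    """Position error (m) accumulated by a stationary sensor with a constant bias."""
    n = int(round(duration_s * rate_hz))
    dt = 1.0 / rate_hz
    # The device is not moving, so every sample is pure bias (assumed constant).
    return integrate_position([bias] * n, dt)[-1]


if __name__ == "__main__":
    for seconds in (10, 60):
        error = drift_from_bias(bias=0.05, duration_s=seconds, rate_hz=100)
        print(f"{seconds:>3} s of integration -> {error:.2f} m of drift")
```

Under these assumptions the error grows with the square of elapsed time: about 2.5 m after 10 s and about 90 m after 60 s, even though the device never moved. This is the kind of accumulation that a translation estimate built on consumer IMUs has to counteract.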

Innovation

Methods, ideas, or system contributions that make the work stand out.

Real-time full-body pose estimation from IMUs

Multi-stage deep neural network for kinematics

Physics-based motion optimizer for accuracy
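
The abstract states that the system works with any available subset of IMUs. One common way to feed a variable set of devices into a fixed-size network input is to reserve a slot per device and zero-fill the slots that are absent. The sketch below illustrates that general idea only. The slot names, the 12 features per device (3 acceleration values plus 9 rotation-matrix entries), and the zero-fill scheme are all assumptions, not details confirmed by this page.

```python
DEVICE_SLOTS = ("phone", "watch", "earbuds")  # hypothetical slot names
FEATURES_PER_DEVICE = 12  # assumed: 3 acceleration values + 9 rotation-matrix entries


def build_input(readings):
    """Pack per-device features into one fixed-length vector.

    `readings` maps a slot name to a list of FEATURES_PER_DEVICE numbers.
    Devices missing from `readings` are zero-filled (an assumed masking scheme).
    """
    unknown = set(readings) - set(DEVICE_SLOTS)
    if unknown:
        raise ValueError(f"unknown device slots: {sorted(unknown)}")

    vector = []
    for slot in DEVICE_SLOTS:
        features = readings.get(slot)
        if features is None:
            vector.extend([0.0] * FEATURES_PER_DEVICE)
        elif len(features) != FEATURES_PER_DEVICE:
            raise ValueError(f"{slot}: expected {FEATURES_PER_DEVICE} features")
        else:
            vector.extend(float(x) for x in features)
    return vector


if __name__ == "__main__":
    # Only a watch is available; the phone and earbuds slots stay at zero.
    frame = build_input({"watch": [1.0] * FEATURES_PER_DEVICE})
    print(len(frame))
```

Keeping the input length constant lets a single model serve every device combination, at the cost of the model having to learn that an all-zero slot means "no sensor" rather than "no motion".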
