Progressive Inertial Poser: Progressive Real-Time Kinematic Chain Estimation for 3D Full-Body Pose from Three IMU Sensors

📅 2025-05-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the need for hardware-lightweight, environment-robust full-body pose estimation in VR. We propose a real-time method that reconstructs 3D full-body pose from only three IMUs mounted on the head and both wrists. Unlike mainstream approaches that rely on pelvic or lower-limb sensors or on external vision, the method uses a progressive multi-stage network: a Transformer-enhanced bidirectional LSTM (TE-biLSTM) encoder coupled with SMPL parameter regression, biomechanical priors, and hierarchical kinematic chain optimization, which together enforce whole-body motion constraints without lower-body sensing. To our knowledge, this is the first method to achieve near-6-IMU accuracy with a 3-IMU configuration. It outperforms prior state-of-the-art methods with identical input modalities across multiple public benchmarks, reducing mean joint error by 12.7% with end-to-end latency under 15 ms, which improves wearability and practical deployment in VR applications.

📝 Abstract
A motion capture system that supports full-body virtual representation is of key significance for virtual reality. Compared to vision-based systems, full-body pose estimation from sparse tracking signals is not limited by environmental conditions or recording range. However, previous works either require additional sensors worn on the pelvis and lower body or rely on external visual sensors to obtain the global positions of key joints. To improve the practicality of the technology for virtual reality applications, we estimate full-body poses using only inertial data obtained from three Inertial Measurement Unit (IMU) sensors worn on the head and wrists, thereby reducing the complexity of the hardware system. In this work, we propose a method called Progressive Inertial Poser (ProgIP) for human pose estimation, which combines neural network estimation with a human dynamics model, considers the hierarchical structure of the kinematic chain, and employs a multi-stage progressive network estimation with increased depth to reconstruct full-body motion in real time. The encoder combines a Transformer encoder and a bidirectional LSTM (TE-biLSTM) to flexibly capture the temporal dependencies of the inertial sequence, while the decoder, based on multi-layer perceptrons (MLPs), transforms high-dimensional features and projects them onto Skinned Multi-Person Linear (SMPL) model parameters. Quantitative and qualitative experimental results on multiple public datasets show that our method outperforms state-of-the-art methods with the same inputs and is comparable to recent works using six IMU sensors.
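
The abstract says the method considers the hierarchical structure of the kinematic chain and estimates the body progressively, but this page does not define the stages. The sketch below is only an illustration of that idea under stated assumptions: it uses the standard 24-joint SMPL parent table, assumes the three IMUs map to the head (joint 15) and the two wrists (joints 20 and 21), and groups joints by how many bones separate them from the nearest sensor. The function names and the stage boundaries `(2, 5)` are hypothetical and are not taken from ProgIP.

```python
from collections import deque

# Parent index of each of the 24 SMPL body joints (-1 marks the root, the pelvis).
SMPL_PARENTS = [-1, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8,
                9, 9, 9, 12, 13, 14, 16, 17, 18, 19, 20, 21]

# Assumption: joints carrying an IMU are head (15), left wrist (20), right wrist (21).
SENSOR_JOINTS = (15, 20, 21)


def hops_from_sensors(parents, sensors):
    """Breadth-first search over the skeleton tree, started at every sensor joint."""
    neighbours = [[] for _ in parents]
    for child, parent in enumerate(parents):
        if parent >= 0:
            neighbours[child].append(parent)
            neighbours[parent].append(child)

    hops = {joint: 0 for joint in sensors}
    queue = deque(sensors)
    while queue:
        joint = queue.popleft()
        for nxt in neighbours[joint]:
            if nxt not in hops:
                hops[nxt] = hops[joint] + 1
                queue.append(nxt)
    return hops


def progressive_stages(hops, boundaries=(2, 5)):
    """Group joints into stages by hop count; the boundaries are hypothetical."""
    stages = [[] for _ in range(len(boundaries) + 1)]
    for joint in sorted(hops):
        # A joint moves one stage later for every boundary its hop count exceeds.
        stage = sum(hops[joint] > b for b in boundaries)
        stages[stage].append(joint)
    return stages


hops = hops_from_sensors(SMPL_PARENTS, SENSOR_JOINTS)
stages = progressive_stages(hops)
for index, joints in enumerate(stages, start=1):
    print(f"stage {index}: joints {joints}")
```

Under these assumptions the pelvis is five bones away from the nearest sensor and the feet are nine, so the last stage holds exactly the hips, knees, ankles, and feet (joints 1, 2, 4, 5, 7, 8, 10, 11). These are the joints with no direct measurement, which is why an estimate for them would have to be inferred from the upper body.
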
Problem

Research questions and friction points this paper is trying to address.

Estimates full-body poses using only three IMU sensors
Combines neural networks with a human dynamics model
Reconstructs real-time motion without lower-body sensors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses only three IMU sensors on head and wrists
Combines a neural network with a human dynamics model
Employs a Transformer encoder and a bidirectional LSTM
Authors

Zunjie Zhu
School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 323000, China; Key Laboratory of Micro-nano Sensing and IoT of Wenzhou, Wenzhou Institute of Hangzhou Dianzi University, Wenzhou, 325038, China
Yan Zhao
School of Control Science and Engineering, Tiangong University, Tianjin 300387, China
Yihan Hu
School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 323000, China
Guoxiang Wang
College of Business, Lishui University, Lishui 310018, China
Hai Qiu
Costar Intelligent Optoelectronics Technology Co., Ltd, China
Bolun Zheng
Hangzhou Dianzi University
multimedia, computer vision
Chenggang Yan
Hangzhou Dianzi University
Feng Xu
School of Software and BNRist, Tsinghua University, Beijing 100084, China