MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond

📅 2025-04-07
🤖 AI Summary
Existing motion capture methods prioritize visual similarity while neglecting physical plausibility, leading to drift, sliding, interpenetration, and trajectory inaccuracies in virtual human animation and robot control. This work introduces, for the first time, plantar pressure sensing to explicitly model human–environment interaction, enabling physically grounded motion estimation. We construct MotionPRO—a large-scale dataset comprising 70 subjects, 400 motions, and 12.4 million frames—and propose a novel pressure-driven, sensor-only paradigm for joint pose and global trajectory estimation. Our method innovatively integrates a vertical-axis whole-body contact constraint and a camera-axis orthogonal similarity constraint to enable cross-modal pressure–RGB fusion. Leveraging a small-kernel decoder, long-short-term attention, and physics-aware feature fusion, it supports SMPL-based reconstruction and robot closed-loop control. Experiments show that pressure-only input achieves high-accuracy lower-body pose and trajectory estimation; pressure–RGB fusion reduces MPJPE by 21.3% and ACCEL by 38.7%. The framework enables slip-free virtual human locomotion and stable, precisely localized humanoid robot motion.

📝 Abstract
Existing human Motion Capture (MoCap) methods mostly focus on visual similarity while neglecting physical plausibility. As a result, downstream tasks such as driving virtual humans in 3D scenes or humanoid robots in the real world suffer from temporal issues such as drift and jitter, spatial problems such as sliding and penetration, and poor global trajectory accuracy. In this paper, we revisit human MoCap from the perspective of the interaction between the human body and the physical world by exploring the role of pressure. First, we construct a large-scale human Motion capture dataset with Pressure, RGB and Optical sensors (named MotionPRO), which comprises 70 volunteers performing 400 types of motion, encompassing a total of 12.4M pose frames. Second, we examine both the necessity and effectiveness of the pressure signal through two challenging tasks: (1) pose and trajectory estimation based solely on pressure: we propose a network that incorporates a small-kernel decoder and a long-short-term attention module, and show that pressure alone can provide an accurate global trajectory and a plausible lower-body pose; (2) pose and trajectory estimation by fusing pressure and RGB: we impose constraints on orthographic similarity along the camera axis and whole-body contact along the vertical axis to enhance the cross-attention strategy used to fuse pressure and RGB feature maps. Experiments demonstrate that fusing pressure with RGB features not only significantly improves performance on objective metrics but also plausibly drives virtual humans (SMPL) in 3D scenes. Furthermore, we show that incorporating physical perception enables humanoid robots to perform more precise and stable actions, which is highly beneficial for the development of embodied artificial intelligence. Project page: https://nju-cite-mocaphumanoid.github.io/MotionPRO/
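The abstract describes fusing pressure and RGB feature maps with a cross-attention strategy (which the paper further enhances with camera-axis and vertical-axis constraints). Below is a minimal NumPy sketch of plain cross-attention in which RGB tokens query pressure tokens; the projection weights, token counts, and feature dimensions are all illustrative placeholders, not the paper's learned parameters or architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(rgb_feat, pressure_feat, d_k=32, seed=0):
    """Fuse pressure tokens into RGB tokens via cross-attention.

    rgb_feat: (N_rgb, d_rgb) query tokens; pressure_feat: (N_p, d_p)
    key/value tokens. Random projections stand in for learned weights.
    """
    rng = np.random.default_rng(seed)
    d_rgb = rgb_feat.shape[-1]
    d_p = pressure_feat.shape[-1]
    Wq = rng.standard_normal((d_rgb, d_k)) / np.sqrt(d_rgb)
    Wk = rng.standard_normal((d_p, d_k)) / np.sqrt(d_p)
    Wv = rng.standard_normal((d_p, d_rgb)) / np.sqrt(d_p)
    Q = rgb_feat @ Wq                                  # (N_rgb, d_k)
    K = pressure_feat @ Wk                             # (N_p, d_k)
    V = pressure_feat @ Wv                             # (N_p, d_rgb)
    attn = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)    # (N_rgb, N_p)
    return rgb_feat + attn @ V                         # residual fusion

# Toy tokens: 16 RGB patches (64-d) and 8 pressure-map patches (32-d).
rgb = np.random.default_rng(1).standard_normal((16, 64))
pressure = np.random.default_rng(2).standard_normal((8, 32))
fused = cross_attention_fuse(rgb, pressure)
print(fused.shape)  # (16, 64)
```

In the paper, this attention is additionally regularized so that the fused features respect orthographic similarity along the camera axis and whole-body contact along the vertical axis; those constraints are omitted here.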
Problem

Research questions and friction points this paper is trying to address.

Addressing physical plausibility gaps in human MoCap methods
Improving pose and trajectory estimation using pressure signals
Enhancing virtual human and humanoid robot motion accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale MoCap dataset with pressure, RGB, and optical sensors
Pressure-based pose and trajectory estimation network
Fusion of pressure and RGB for enhanced accuracy
Shenghao Ren
School of Electronic Science and Engineering, Nanjing University, Nanjing, China
Yi Lu
School of Electronic Science and Engineering, Nanjing University, Nanjing, China
Jiayi Huang
School of Electronic Science and Engineering, Nanjing University, Nanjing, China
Jiayi Zhao
School of Electronic Science and Engineering, Nanjing University, Nanjing, China
He Zhang
BNRist, Tsinghua University, Beijing, China
Tao Yu
BNRist, Tsinghua University, Beijing, China
Qiu Shen
Nanjing University
Xun Cao
Nanjing University
Computational Photography · Computational Imaging · Image & Video Processing