WristPP: A Wrist-Worn System for Hand Pose and Pressure Estimation

📅 2026-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Accurate simultaneous estimation of 3D hand pose and contact pressure in mobile scenarios remains challenging. This work proposes WristPP, a camera-based wrist-worn system that uses a single wide-FOV RGB frame to jointly estimate 3D hand pose and per-vertex contact pressure in real time, enabling pressure-based interaction on an uninstrumented desktop as well as mid-air pointing. The method employs a Vision Transformer backbone with joint-aligned tokens that predicts Hand-VQVAE codebook indices for mesh recovery, alongside an extrinsics-conditioned branch that estimates per-vertex pressure. Evaluated on a self-collected dataset of 133,000 frames from 20 subjects, the system achieves a mean per-joint position error (MPJPE) of 2.9 mm and a Contact IoU of 0.712. User studies show touchpad-level efficiency in mid-air pointing, and in a large-display Whac-A-Mole task WristPP yields a higher success ratio and lower arm fatigue than head-mounted camera-based baselines.

📝 Abstract
Accurate 3D hand pose and pressure sensing is essential for immersive human-computer interaction, yet simultaneously achieving both in mobile scenarios remains a significant challenge. We present WristPP, a camera-based wrist-worn system that estimates 3D hand pose and per-vertex pressure from a single wide-FOV RGB frame in real time. A Vision Transformer (ViT) backbone with joint-aligned tokens predicts Hand-VQVAE codebook indices for mesh recovery, while an extrinsics-conditioned branch jointly estimates per-vertex pressure. On a self-collected dataset of 133,000 frames (20 subjects; 48 on-plane and 28 mid-air gestures), WristPP attains a Mean Per-Joint Position Error (MPJPE) of 2.9 mm, Contact IoU of 0.712, Volumetric IoU of 0.618, and foreground pressure MAE of 10.4 g. Across three user studies, WristPP delivers touchpad-level efficiency in mid-air pointing and robust multi-finger pressure control on an uninstrumented desktop. In a real-world large-display Whac-A-Mole task, WristPP also enables higher success ratio and lower arm fatigue than head-mounted camera-based baselines. These results position WristPP as an effective, mobile solution for versatile pose- and pressure-based interaction. Website: https://zhenqis123.github.io/WristPP/.
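The metrics quoted in the abstract can be illustrated with a minimal sketch. The definitions below are assumptions, not the paper's evaluation code: MPJPE is taken as the mean Euclidean distance between predicted and ground-truth joints, Contact IoU as the intersection-over-union of per-vertex contact masks obtained by thresholding pressure, and foreground pressure MAE as the mean absolute error over vertices that are in contact in the ground truth. The function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def mpjpe(pred_joints, gt_joints):
    """Mean per-joint position error: average Euclidean distance over joints.

    Both inputs have shape (J, 3) and share the same unit (e.g. mm).
    """
    pred = np.asarray(pred_joints, dtype=float)
    gt = np.asarray(gt_joints, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def contact_iou(pred_pressure, gt_pressure, threshold=0.0):
    """IoU of binary contact masks (assumed: contact means pressure > threshold)."""
    pred_c = np.asarray(pred_pressure, dtype=float) > threshold
    gt_c = np.asarray(gt_pressure, dtype=float) > threshold
    union = np.logical_or(pred_c, gt_c).sum()
    if union == 0:
        # Neither mask contains contact; treat as perfect agreement.
        return 1.0
    return float(np.logical_and(pred_c, gt_c).sum() / union)


def foreground_mae(pred_pressure, gt_pressure, threshold=0.0):
    """Mean absolute pressure error over vertices in contact in the ground truth."""
    pred = np.asarray(pred_pressure, dtype=float)
    gt = np.asarray(gt_pressure, dtype=float)
    mask = gt > threshold
    if not mask.any():
        return 0.0
    return float(np.abs(pred[mask] - gt[mask]).mean())
```

Under these assumed definitions, a prediction that offsets every joint by 3 mm along one axis gives an MPJPE of 3 mm, and two contact masks that share one vertex out of three touched in total give a Contact IoU of 1/3.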
Problem

Research questions and friction points this paper is trying to address.

hand pose estimation
pressure sensing
human-computer interaction
mobile interaction
3D hand tracking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wrist-worn sensing
3D hand pose estimation
Pressure estimation
Vision Transformer
Hand-VQVAE
Authors

Ziheng Xi
Undergraduate student, Department of Automation, Tsinghua University
Machine Learning, Deep Learning, Pattern Recognition
Zihang Ao
Department of Automation, Tsinghua University
Yitao Wang
Department of Automation, Tsinghua University
Mingeze Gao
Tsinghua University
Wanmei Zhang
Tsinghua University
Jianjiang Feng
Department of Automation, Tsinghua University
Jie Zhou
Tsinghua University
Graph Neural Networks, Natural Language Processing