DeltaDorsal: Enhancing Hand Pose Estimation with Dorsal Features in Egocentric Views

📅 2026-01-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the loss of accuracy in egocentric (first-person) hand pose estimation caused by frequent finger occlusions. The authors propose a dual-stream delta encoder that operates solely on cropped dorsal-hand images and infers pose by contrasting skin-deformation features of the moving hand with those of a relaxed reference state. In self-occluded scenarios (fingers ≥ 50% occluded), the method reduces the Mean Per Joint Angle Error (MPJAE) by 18% compared with state-of-the-art techniques that depend on whole-hand geometry and large model backbones. This improves the reliability of downstream tasks such as index-finger pinch and tap estimation, enables new interactions such as detecting isometric force for a surface "click" without visible movement, and keeps the model small.
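The contrast between a moving hand and a relaxed reference can be illustrated with a small sketch. This is not the authors' architecture: the single linear layer standing in for a dense visual featurizer, the weight sharing between the two streams, the plain subtraction used as the "delta", and every name here (`encode`, `delta_features`, `predict_angles`) are assumptions made for illustration only.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = Sequence[Sequence[float]]


def encode(patch: Sequence[float], weights: Matrix) -> Vector:
    """Hypothetical stand-in for a dense featurizer: one linear layer plus ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, patch))) for row in weights]


def delta_features(dynamic_patch: Sequence[float],
                   reference_patch: Sequence[float],
                   weights: Matrix) -> Vector:
    """Encode both streams (weights shared by assumption) and subtract the
    relaxed-hand features from the dynamic-hand features."""
    dyn = encode(dynamic_patch, weights)
    ref = encode(reference_patch, weights)
    return [d - r for d, r in zip(dyn, ref)]


def predict_angles(delta: Vector, head: Matrix) -> Vector:
    """Hypothetical linear head mapping delta features to joint angles."""
    return [sum(w * d for w, d in zip(row, delta)) for row in head]


# Toy two-pixel "dorsal crops" and identity weights, purely for illustration.
weights = [[1.0, 0.0], [0.0, 1.0]]
head = [[10.0, 0.0]]
relaxed = [0.5, 0.2]
flexed = [0.8, 0.2]

no_motion = delta_features(relaxed, relaxed, weights)
motion = delta_features(flexed, relaxed, weights)
angles = predict_angles(motion, head)
```

In this toy setup an image identical to the reference yields an all-zero delta, so the head only responds to deformation relative to the relaxed state.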

📝 Abstract

The proliferation of XR devices has made egocentric hand pose estimation a vital task, yet this perspective is inherently challenged by frequent finger occlusions. To address this, we propose a novel approach that leverages the rich information in dorsal hand skin deformation, unlocked by recent advances in dense visual featurizers. We introduce a dual-stream delta encoder that learns pose by contrasting features from a dynamic hand with a baseline relaxed position. Our evaluation demonstrates that, using only cropped dorsal images, our method reduces the Mean Per Joint Angle Error (MPJAE) by 18% in self-occluded scenarios (fingers ≥ 50% occluded) compared to state-of-the-art techniques that depend on the whole hand's geometry and large model backbones. Consequently, our method not only enhances the reliability of downstream tasks like index finger pinch and tap estimation in occluded scenarios but also unlocks new interaction paradigms, such as detecting isometric force for a surface "click" without visible movement while minimizing model size.
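To make the reported metric concrete, here is a minimal sketch of how an MPJAE restricted to heavily occluded frames could be computed, and how a relative reduction such as 18% is derived from two error values. The abstract does not give the exact definition, so the mean-absolute-difference form, the per-frame occlusion fraction, the function names, and all numbers below are assumptions rather than the paper's evaluation code or data.

```python
from typing import Sequence


def mpjae(pred: Sequence[Sequence[float]],
          truth: Sequence[Sequence[float]],
          occlusion: Sequence[float],
          min_occlusion: float = 0.0) -> float:
    """Mean absolute joint-angle error (degrees) over frames whose
    occlusion fraction is at least min_occlusion."""
    errors = []
    for p_frame, t_frame, occ in zip(pred, truth, occlusion):
        if occ < min_occlusion:
            continue  # skip frames below the occlusion threshold
        errors.extend(abs(p - t) for p, t in zip(p_frame, t_frame))
    if not errors:
        raise ValueError("no frames meet the occlusion threshold")
    return sum(errors) / len(errors)


def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Fractional reduction of new_error relative to baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Made-up joint angles for two frames with two joints each.
pred = [[10.0, 20.0], [30.0, 40.0]]
truth = [[12.0, 20.0], [33.0, 45.0]]
occlusion = [0.2, 0.6]  # assumed fraction of finger area occluded per frame

all_frames = mpjae(pred, truth, occlusion)                     # (2+0+3+5)/4
occluded_only = mpjae(pred, truth, occlusion, min_occlusion=0.5)  # (3+5)/2
reduction = relative_reduction(10.0, 8.2)                      # illustrative values
```

With these illustrative values, an error that drops from 10.0° to 8.2° corresponds to a relative reduction of 18%.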

Problem

Research questions and friction points this paper is trying to address.

hand pose estimation

egocentric view

finger occlusion

dorsal features

XR devices

Innovation

Methods, ideas, or system contributions that make the work stand out.

dorsal hand features

egocentric hand pose estimation

delta encoder