🤖 AI Summary
Existing 3D hand pose estimation methods overlook the localization role of the fingertips (TIP) and the wrist, and they struggle to suppress error accumulation at distal joints, which leads to inaccurate poses and reconstruction artifacts. To address this, we propose EHPE, a segmented enhancement framework for hand pose estimation. First, a TIP-and-wrist-prioritized extraction module mitigates forward error propagation. Second, a dual-branch interactive network fuses local feature representations with anatomical prior guidance to optimize the individual joints. EHPE takes a single monocular RGB image as input and significantly improves distal joint localization accuracy. Evaluated on two mainstream benchmarks, FreiHAND and STB, it achieves state-of-the-art performance, reducing mean joint error by 12.3% and markedly enhancing hand mesh reconstruction quality. The source code is publicly available.
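To make the reported error reduction concrete, the following is a minimal sketch of how a mean joint error and a relative reduction could be computed. It assumes the common convention of averaging the Euclidean distance between predicted and ground-truth 3D joints; the summary does not state the exact metric definition, and the function names and toy coordinates here are hypothetical, not taken from the EHPE code.

```python
import math


def mean_joint_error(pred, gt):
    """Average Euclidean distance between matching 3D joints.

    Assumption: this mirrors the usual mean per-joint position error;
    the source does not spell out its exact metric.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy data (hypothetical): two joints, off by 3 and 4 units respectively.
toy_pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_gt = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
toy_error = mean_joint_error(toy_pred, toy_gt)
```

A relative figure such as the one quoted above would then be `relative_reduction(baseline_error, new_error)`, with both errors measured on the same benchmark split.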
📝 Abstract
3D hand pose estimation has attracted considerable attention in recent years because of its applications in human-computer interaction, virtual reality, and related fields. Accurate estimation of the hand joints is essential for high-quality hand pose estimation. However, existing methods neglect the importance of the Distal Phalanx Tip (TIP) and the wrist in predicting the hand joints as a whole, and they often fail to account for error accumulation at distal joints. As a result, certain joints incur larger errors, which causes misalignments and artifacts in the estimated pose and degrades the overall reconstruction quality. To address this challenge, we propose a novel segmented architecture for enhanced hand pose estimation (EHPE). We perform local extraction of the TIP and wrist joints, which alleviates the effect of error accumulation on TIP prediction, and on this basis further reduce the prediction errors for all joints. EHPE consists of two key stages. In the TIP and Wrist Joints Extraction stage (TW-stage), the positions of the TIP and wrist joints are estimated to provide an accurate initial joint configuration. In the Prior Guided Joints Estimation stage (PG-stage), a dual-branch interaction network refines the positions of the remaining joints. Extensive experiments on two widely used benchmarks demonstrate that EHPE achieves state-of-the-art performance. Code is available at https://github.com/SereinNout/EHPE.
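The two-stage split described in the abstract can be sketched as a partition of the joint set followed by a merge. This is only an illustration under stated assumptions: it assumes a 21-joint hand with the wrist at index 0 and fingertips at indices 4, 8, 12, 16, and 20 (a widely used ordering, not confirmed by the abstract), and `estimate_pose`, `dummy_tw_stage`, and `dummy_pg_stage` are hypothetical placeholders rather than the networks in the EHPE repository.

```python
# Assumed 21-joint layout: wrist at 0, four joints per finger, tip last.
NUM_JOINTS = 21
WRIST = 0
TIPS = (4, 8, 12, 16, 20)

# Joints estimated first (TW-stage) and joints estimated afterwards (PG-stage).
TW_JOINTS = (WRIST,) + TIPS
PG_JOINTS = tuple(j for j in range(NUM_JOINTS) if j not in TW_JOINTS)


def estimate_pose(image, tw_stage, pg_stage):
    """Run the two stages in order and merge their outputs into one pose.

    tw_stage(image) -> {joint index: (x, y, z)} for TW_JOINTS
    pg_stage(image, tw) -> {joint index: (x, y, z)} for PG_JOINTS
    Both callables are hypothetical stand-ins for learned networks.
    """
    tw = tw_stage(image)
    rest = pg_stage(image, tw)  # TW-stage output acts as the prior
    pose = [None] * NUM_JOINTS
    for j in TW_JOINTS:
        pose[j] = tw[j]
    for j in PG_JOINTS:
        pose[j] = rest[j]
    return pose


# Dummy stages that return fixed coordinates, only to show the data flow.
def dummy_tw_stage(image):
    return {j: (float(j), 0.0, 0.0) for j in TW_JOINTS}


def dummy_pg_stage(image, tw):
    return {j: (float(j), 1.0, 0.0) for j in PG_JOINTS}


pose = estimate_pose(None, dummy_tw_stage, dummy_pg_stage)
```

Passing the TW-stage output into the second callable reflects the "prior guided" wording of the abstract: the six anchor joints are fixed first, and the remaining fifteen are estimated conditioned on them.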