🤖 AI Summary
Markerless human pose estimation (HPE) suffers from keypoint misidentification and trajectory jitter, and existing deep learning refinement models are limited by the noise inherent in manually annotated ground truth. To address these issues, this paper proposes a joint-angle-based pose refinement framework. First, human pose is represented by joint angles, and the temporal variation of each angle is fitted with a high-order Fourier series to generate reliable synthetic ground truth. Second, a bidirectional recurrent neural network is designed as a post-processing module that refines HRNet’s output in space and time. By avoiding reliance on error-prone manual annotations, the method corrects misrecognized joints and smooths keypoint trajectories, particularly for dynamic, complex motions such as figure skating and breaking. Experiments show that it outperforms the state-of-the-art HPE refinement network in these challenging cases.
📝 Abstract
Marker-free human pose estimation (HPE) is finding increasing application in various fields. Current HPE methods suffer from occasional keypoint recognition errors and random fluctuations in keypoint trajectories when analyzing human poses in motion. The performance of existing deep learning-based models for HPE refinement is considerably limited by inaccurate training datasets in which the keypoints are manually annotated. This paper proposes a novel method that overcomes this difficulty through joint angle-based modeling. The key techniques are: (i) a joint angle-based model of human pose that robustly describes kinematic human poses; (ii) approximation of the temporal variation of joint angles with high-order Fourier series to obtain reliable "ground truth"; and (iii) a bidirectional recurrent network designed as a post-processing module to refine the estimates of the well-established HRNet. Trained on the high-quality dataset constructed with our method, the network performs strongly in correcting wrongly recognized joints and smoothing their spatiotemporal trajectories. Tests show that joint angle-based refinement (JAR) outperforms the state-of-the-art HPE refinement network in challenging cases such as figure skating and breaking.
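Techniques (i) and (ii) can be illustrated with a minimal sketch. It assumes 2D keypoints, a joint angle defined as the angle between the two limb segments meeting at a joint, and an ordinary least-squares fit of a truncated Fourier series. The function names, the Fourier order, the period, and the synthetic signal are all hypothetical choices for illustration; the abstract does not specify the paper's actual angle parameterization or fitting procedure.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in radians at joint b, formed by 2D keypoints a-b-c (assumed definition)."""
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    u, v = a - b, c - b
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def fourier_design(t, order, period):
    """Design matrix with columns 1, cos(k*w*t), sin(k*w*t) for k = 1..order."""
    w = 2.0 * np.pi / period
    cols = [np.ones_like(t)]
    for k in range(1, order + 1):
        cols.append(np.cos(k * w * t))
        cols.append(np.sin(k * w * t))
    return np.stack(cols, axis=1)

def fit_fourier(t, angles, order, period):
    """Least-squares Fourier coefficients and the fitted (smoothed) angle series."""
    A = fourier_design(t, order, period)
    coeffs, *_ = np.linalg.lstsq(A, angles, rcond=None)
    return coeffs, A @ coeffs

# Synthetic joint-angle trajectory (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
period = 2.0
t = np.linspace(0.0, period, 120, endpoint=False)
clean = (1.2
         + 0.5 * np.sin(2 * np.pi * t / period)
         + 0.2 * np.cos(2 * np.pi * 3 * t / period))
noisy = clean + rng.normal(scale=0.05, size=t.shape)

coeffs, smooth = fit_fourier(t, noisy, order=5, period=period)
```

In this sketch the fitted series `smooth` plays the role of the smoothed angle trajectory. How the paper maps fitted joint angles back to keypoint coordinates to build its training targets is not described in the abstract and is left out here.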