Learning semantical dynamics and spatiotemporal collaboration for human pose estimation in video

📅 2025-02-01

🏛️ Neurocomputing

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the limited temporal robustness of video-based human pose estimation under image degradations such as occlusion and motion blur, this paper proposes a framework that jointly models semantic dynamics and spatiotemporal collaboration. The method introduces: (1) a learnable semantic state transition module that explicitly captures how joint states evolve semantically from frame to frame; and (2) a bidirectional spatiotemporal graph co-propagation mechanism that combines GCN-based spatial modeling with GRU-based temporal modeling, enhanced by semantic attention and cross-frame topological consistency constraints. Evaluated on PoseTrack18 and JTA, the approach achieves 78.3% and 82.6% mAP, respectively, and reduces temporal jitter by 37%, outperforming current state-of-the-art methods.
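
The summary does not say how temporal jitter is measured. As a rough illustration only, the sketch below uses one common proxy: the mean magnitude of the second-order frame difference (discrete acceleration) of a joint's 2D trajectory. It then computes the relative reduction between two tracks. The function names `mean_jitter` and `relative_reduction` and the toy coordinates are hypothetical and are not taken from the paper.

```python
from math import hypot


def mean_jitter(track):
    """Mean discrete-acceleration magnitude of one joint's 2D track.

    track: list of (x, y) positions, one entry per frame.
    This is an assumed jitter proxy, not the paper's definition.
    """
    if len(track) < 3:
        return 0.0  # acceleration needs at least three frames
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(track, track[1:], track[2:]):
        # Second-order difference: p[t+1] - 2 * p[t] + p[t-1]
        total += hypot(x2 - 2 * x1 + x0, y2 - 2 * y1 + y0)
    return total / (len(track) - 2)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline jitter must be non-zero")
    return (baseline - improved) / baseline


# Toy tracks (made-up data): the second oscillates half as much as the first.
noisy = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
smooth = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0), (4.0, 0.0)]

reduction = relative_reduction(mean_jitter(noisy), mean_jitter(smooth))
print(f"jitter reduction: {reduction:.0%}")
```

Under this proxy, a trajectory moving at constant velocity scores zero, so the metric penalizes frame-to-frame wobble rather than motion itself. A reported figure such as 37% would correspond to `relative_reduction` returning 0.37 for whatever jitter definition the authors actually use.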