Learning semantical dynamics and spatiotemporal collaboration for human pose estimation in video

๐Ÿ“… 2025-02-01
๐Ÿ›๏ธ Neurocomputing
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To address insufficient temporal robustness in video-based human pose estimation caused by image degradations (e.g., occlusion, motion blur), this paper proposes a novel framework jointly modeling semantic dynamic evolution and spatiotemporal collaboration. Our method introduces: (1) a learnable semantic state transition module that explicitly captures inter-frame semantic evolution of joint states; and (2) a bidirectional spatiotemporal graph co-propagation mechanism, integrating GCN-based spatial modeling with GRU-based temporal modeling, enhanced by semantic attention and cross-frame topological consistency constraints. Evaluated on PoseTrack18 and JTA, our approach achieves 78.3% and 82.6% mAP, respectively, while reducing temporal jitter by 37%โ€”outperforming current state-of-the-art methods.

Technology Category

Application Category

Problem

Research questions and friction points this paper is trying to address.

Enhancing video-based human pose estimation
Improving semantical dynamics understanding
Optimizing spatio-temporal feature collaboration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Level Semantic Motion Encoder
Spatial-Motion Mutual Learning
multi-masked context strategy
๐Ÿ”Ž Similar Papers
No similar papers found.