Motion-Adaptive Multi-Scale Temporal Modelling with Skeleton-Constrained Spatial Graphs for Efficient 3D Human Pose Estimation

📅 2026-04-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets 3D human pose estimation from monocular video, where existing methods often model spatiotemporal dependencies inefficiently and adapt poorly to varied motion, particularly when they rely on dense attention or fixed modelling schemes. The authors propose MASC-Pose, a framework that combines an Adaptive Multi-scale Temporal Modelling (AMTM) module, which captures heterogeneous motion dynamics at different temporal scales, with a Skeleton-constrained Adaptive Graph Convolutional Network (SAGCN) for joint-specific spatial interaction. According to the abstract, the method achieves strong accuracy with high computational efficiency, with experiments reported on the Human3.6M and MPI-INF-3DHP benchmarks.
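The summary does not give implementation details for AMTM, so the following is only an illustrative sketch of what motion-adaptive multi-scale temporal modelling could look like. Everything in it is an assumption made for illustration: the function names `moving_average` and `adaptive_multiscale`, the window sizes, the use of mean joint speed as the motion cue, and the softmax gating are hypothetical and are not taken from MASC-Pose.

```python
import numpy as np


def moving_average(x, window):
    """Average a (T, J, C) pose sequence over time with an odd window (edge-padded)."""
    pad = window // 2
    padded = np.pad(x, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    # A prefix sum turns the sliding mean into one subtraction per frame.
    csum = np.cumsum(padded, axis=0)
    csum = np.concatenate([np.zeros_like(csum[:1]), csum], axis=0)
    return (csum[window:] - csum[:-window]) / window


def adaptive_multiscale(x, windows=(1, 3, 9), temperature=0.1):
    """Blend several temporal scales per frame, weighted by motion magnitude.

    Hypothetical gating rule: fast frames favour short windows,
    slow frames favour long windows.
    """
    # Per-frame motion cue: mean joint speed, normalised to [0, 1].
    velocity = np.diff(x, axis=0, prepend=x[:1])
    speed = np.linalg.norm(velocity, axis=-1).mean(axis=-1)
    speed = speed / (speed.max() + 1e-8)

    # Each branch has a preferred speed; the shortest window prefers speed 1.
    centers = np.linspace(1.0, 0.0, len(windows))
    logits = -((speed[:, None] - centers[None, :]) ** 2) / temperature
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # (T, S), rows sum to 1

    branches = np.stack([moving_average(x, w) for w in windows], axis=1)  # (T, S, J, C)
    fused = (weights[:, :, None, None] * branches).sum(axis=1)
    return fused, weights


# Usage: 16 frames, 17 joints, 3 coordinates of random data.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(16, 17, 3))
fused, weights = adaptive_multiscale(sequence)
```

In a learned model the gating weights would presumably come from trainable layers rather than a fixed rule; the fixed rule here only makes the idea of per-frame scale selection concrete.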

📝 Abstract

Accurate 3D human pose estimation from monocular videos requires effective modelling of complex spatial and temporal dependencies. However, existing methods often struggle with efficiency and adaptability, particularly when they rely on dense attention or fixed modelling schemes. In this work, we propose MASC-Pose, a Motion-Adaptive multi-scale temporal modelling framework with Skeleton-Constrained spatial graphs for efficient 3D human pose estimation. Specifically, it introduces an Adaptive Multi-scale Temporal Modelling (AMTM) module that captures heterogeneous motion dynamics at different temporal scales, together with a Skeleton-constrained Adaptive GCN (SAGCN) for joint-specific spatial interaction modelling. By combining adaptive temporal reasoning with efficient spatial aggregation, our method achieves strong accuracy with high computational efficiency. Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets demonstrate the effectiveness of our approach.
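The abstract names a skeleton-constrained adaptive GCN but does not define it. As a rough sketch of the general idea, the example below restricts a learnable joint-to-joint weighting to pairs that are connected in the skeleton (plus self-loops) and then aggregates features over those neighbours. The 5-joint toy skeleton, the names `skeleton_mask` and `constrained_graph_conv`, and the masked-softmax formulation are all assumptions for illustration, not the SAGCN described in the paper.

```python
import numpy as np


def skeleton_mask(num_joints, bones):
    """Binary (J, J) mask: 1 for self-loops and for joints linked by a bone."""
    mask = np.eye(num_joints)
    for i, j in bones:
        mask[i, j] = mask[j, i] = 1.0
    return mask


def constrained_graph_conv(x, mask, adj_logits, weight):
    """One graph-convolution step with skeleton-masked adaptive weights.

    x:          (J, C_in) joint features
    mask:       (J, J) skeleton mask from skeleton_mask
    adj_logits: (J, J) free parameters (would be learned in practice)
    weight:     (C_in, C_out) feature projection
    """
    # Pairs not allowed by the skeleton get -inf, so softmax gives them 0.
    logits = np.where(mask > 0, adj_logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ x @ weight, attn


# Hypothetical toy skeleton: a root joint (0) with two 2-joint limbs.
bones = [(0, 1), (1, 2), (0, 3), (3, 4)]
mask = skeleton_mask(5, bones)

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))       # e.g. 3D coordinates per joint
adj_logits = rng.normal(size=(5, 5))     # stand-in for learned parameters
weight = rng.normal(size=(3, 8))

out, attn = constrained_graph_conv(features, mask, adj_logits, weight)
```

Because the mask zeroes every non-skeleton pair, each joint mixes features from at most its direct neighbours, which is one way a skeleton constraint can keep spatial aggregation sparse compared with dense all-pairs attention.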

Problem

Research questions and friction points this paper is trying to address.

3D human pose estimation

spatial-temporal dependencies

computational efficiency

motion adaptability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Motion-Adaptive

Multi-Scale Temporal Modelling

Skeleton-Constrained Graph

3D Human Pose Estimation