2D Pre-Training for 3D Pose Estimation

📅 2026-04-19

🤖 AI Summary
This work addresses a limitation of pre-training in 3D human pose estimation (HPE): existing methods typically draw on only a few strong benchmarks, such as Human3.6M. It extends an existing 3D HPE scheme to work with additional 2D and 3D datasets, including MPII, Human3.6M, and Occlusion Person, and systematically studies how 2D pose pre-training affects downstream 3D performance. Experiments show that 2D pre-training consistently outperforms training on 3D data alone, particularly in computational efficiency. Using MPII together with Human3.6M, the method reaches a mean per-joint position error (MPJPE) under 64.5 mm. The study also analyzes how factors such as model size affect downstream performance and to what extent pre-training helps the model generalize across datasets.

📝 Abstract
Pre-training is a general method used in a range of deep learning tasks. By first training a model on one task and then further training it on the downstream task used for final evaluation, the model is forced to learn a more general understanding of the input data. While pre-training has previously been applied to 3D Human Pose Estimation (HPE), the datasets used are typically limited to a few strong benchmarks, such as Human3.6M. Therefore, in this project, we expand an existing 3D HPE scheme to be compatible with additional 2D and 3D HPE datasets, such as Occlusion Person. We perform an extensive study of how aspects of 2D pre-training, such as model size, affect downstream performance, and to what extent pre-training helps the model generalize to different datasets. Experimental results show that 2D pre-training consistently outperforms training on 3D data alone, particularly in terms of computational efficiency. Finally, using MPII and Human3.6M, we obtain an MPJPE of under 64.5 mm.
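
For readers unfamiliar with the metric, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth joint positions. The sketch below is a minimal illustration, not the authors' evaluation code: the function name `mpjpe`, the three-joint poses, and the absence of any alignment step (such as root-centering) are assumptions made for the example.

```python
import math

def mpjpe(predicted, target):
    """Mean per-joint position error, in the units of the inputs (e.g. mm).

    Both arguments are sequences of (x, y, z) joint positions in the same
    joint order. No alignment is applied here (an assumption of this sketch).
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("poses must be non-empty and have the same joint count")
    # Euclidean distance for each joint, averaged over all joints.
    return sum(math.dist(p, t) for p, t in zip(predicted, target)) / len(predicted)

# Hypothetical three-joint pose, coordinates in millimetres.
target = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
predicted = [(3.0, 4.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 12.0)]

error_mm = mpjpe(predicted, target)  # per-joint errors are 5, 0 and 12 mm
```

On a full benchmark the per-pose value would additionally be averaged over all evaluated frames; a reported figure such as "under 64.5 mm" refers to that dataset-level average.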
Problem

Research questions and friction points this paper is trying to address.

3D Human Pose Estimation
Pre-training
Dataset Generalization
2D Pose Data
Innovation

Methods, ideas, or system contributions that make the work stand out.

2D pre-training
3D human pose estimation
cross-dataset generalization
computational efficiency
MPJPE