Unsupervised Cross-Domain 3D Human Pose Estimation via Pseudo-Label-Guided Global Transforms

📅 2025-04-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Cross-domain 3D human pose estimation suffers from significant domain shift caused by global factors such as camera viewpoint and position, leading to degraded performance. To address this, we propose a pseudo-label-guided global coordinate transformation framework for unsupervised source-to-target domain adaptation. Our method introduces a novel global spatial alignment mechanism based on a human-centered coordinate system, effectively decoupling global body pose from local joint relationships. We further design a joint training paradigm integrating iterative pseudo-label generation with pose-aware data augmentation. Notably, our approach is the first to surpass fully supervised target-domain models under purely unsupervised settings. Extensive cross-dataset evaluations on Human3.6M, MPI-INF-3DHP, and 3DPW demonstrate consistent superiority over state-of-the-art methods, validating both effectiveness and generalizability.

📝 Abstract
Existing 3D human pose estimation methods often degrade in performance when applied to cross-scenario inference, due to domain shifts in characteristics such as camera viewpoint, position, posture, and body size. Among these factors, camera viewpoints and locations have been shown to contribute significantly to the domain gap by influencing the global positions of human poses. To address this, we propose a novel framework that explicitly conducts global transformations between pose positions in the camera coordinate systems of source and target domains. We start with a Pseudo-Label Generation Module that is applied to the 2D poses of the target dataset to generate pseudo-3D poses. Then, a Global Transformation Module leverages a human-centered coordinate system as a novel bridging mechanism to seamlessly align the positional orientations of poses across disparate domains, ensuring consistent spatial referencing. To further enhance generalization, a Pose Augmentor is incorporated to address variations in human posture and body size. This process is iterative, allowing refined pseudo-labels to progressively improve guidance for domain adaptation. Our method is evaluated on various cross-dataset benchmarks, including Human3.6M, MPI-INF-3DHP, and 3DPW. The proposed method outperforms state-of-the-art approaches and even surpasses the target-trained model.
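The abstract's key idea, re-expressing camera-space poses in a human-centered coordinate system so that source and target domains share a common spatial reference, can be sketched as below. This is a minimal illustration, not the paper's implementation: the joint indices (Human3.6M-style pelvis, hips, neck) and the choice of basis axes are assumptions.

```python
import numpy as np

def to_human_centered(pose, root=0, rhip=1, lhip=4, neck=8):
    """Map a (J, 3) camera-space pose into a human-centered frame:
    origin at the root joint, x-axis along the hip line, z-axis along
    the spine, y-axis completing a right-handed basis.
    Joint indices are illustrative (Human3.6M-style) assumptions."""
    p = pose - pose[root]                 # translate root to the origin
    x = p[lhip] - p[rhip]                 # hip line defines the x-axis
    x = x / np.linalg.norm(x)
    spine = p[neck] - p[root]             # root-to-neck direction
    z = spine - np.dot(spine, x) * x      # Gram-Schmidt: orthogonalize
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)                    # completes a right-handed frame
    R = np.stack([x, y, z])               # rows are the new basis axes
    return p @ R.T                        # coordinates in the new frame
```

After this mapping, poses from different camera viewpoints become directly comparable, which is what lets the pseudo-3D labels from the target domain guide adaptation consistently.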
Problem

Research questions and friction points this paper is trying to address.

Addressing domain shifts in cross-scenario 3D human pose estimation
Aligning global pose positions across different camera viewpoints
Improving generalization for posture and body size variations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pseudo-label-guided global transforms for 3D poses
Human-centered coordinate system for cross-domain alignment
Iterative pose augmentation to enhance generalization
Jingjing Liu
School of Computer Science, University of Bristol, United Kingdom
Zhiyong Wang
State Key Laboratory of Robotics and Systems, Harbin Institute of Technology Shenzhen, China
Xinyu Fan
School of Aerospace Engineering, Xiamen University, China
Amirhossein Dadashzadeh
University of Bristol
Computer Vision, Machine Learning
Honghai Liu
Portsmouth University
Human-Machine Systems, Multi-Sensory Data Fusion and Information Analytics, Bio-Mechatronics, Pattern Recognition, Intelligent Robotics
Majid Mirmehdi
Professor of Computer Vision, FIAPR, FBMVA, University of Bristol
Computer Vision and Pattern Recognition