🤖 AI Summary
Cross-domain 3D human pose estimation suffers from significant domain shift caused by global factors such as camera viewpoint and position, leading to degraded performance. To address this, we propose a pseudo-label-guided global coordinate transformation framework for unsupervised source-to-target domain adaptation. Our method introduces a global spatial alignment mechanism based on a human-centered coordinate system, decoupling global body position and orientation from local joint relationships. We further design a joint training paradigm that integrates iterative pseudo-label generation with pose-aware data augmentation. Notably, our approach outperforms even the model trained on the target domain while remaining unsupervised. Cross-dataset evaluations on Human3.6M, MPI-INF-3DHP, and 3DPW show consistent gains over state-of-the-art methods, supporting both its effectiveness and its generalizability.
📝 Abstract
Existing 3D human pose estimation methods often degrade in cross-scenario inference because of domain shifts in characteristics such as camera viewpoint, position, posture, and body size. Among these factors, camera viewpoint and location have been shown to contribute significantly to the domain gap by influencing the global positions of human poses. To address this, we propose a novel framework that explicitly conducts global transformations between pose positions in the camera coordinate systems of the source and target domains. We start with a Pseudo-Label Generation Module that is applied to the 2D poses of the target dataset to generate pseudo-3D poses. Then, a Global Transformation Module uses a human-centered coordinate system as a bridge to align the positions and orientations of poses across domains, ensuring consistent spatial referencing. To further enhance generalization, a Pose Augmentor is incorporated to address variations in human posture and body size. The process is iterative, so that refined pseudo-labels progressively improve the guidance for domain adaptation. We evaluate our method on several cross-dataset benchmarks, including Human3.6M, MPI-INF-3DHP, and 3DPW. It outperforms state-of-the-art approaches and even the target-trained model.
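The idea of bridging two camera coordinate systems through a human-centered frame can be illustrated with a minimal NumPy sketch. This is one plausible reading of the description above, not the authors' implementation. The joint indices, the way the body frame is built (pelvis origin, hip axis, spine direction), and the function names `human_frame`, `to_human_centered`, and `transfer_global` are all assumptions made for illustration.

```python
import numpy as np

# Assumed joint indices for a 17-joint skeleton (hypothetical, not from the paper).
PELVIS, R_HIP, L_HIP, THORAX = 0, 1, 4, 8


def human_frame(pose):
    """Return (R, t) for a (J, 3) pose given in camera coordinates.

    t is the pelvis position; the columns of R are the body axes
    expressed in camera coordinates.
    """
    t = pose[PELVIS]
    x = pose[L_HIP] - pose[R_HIP]        # axis across the hips
    x = x / np.linalg.norm(x)
    up = pose[THORAX] - t                # rough spine direction
    y = up - np.dot(up, x) * x           # Gram-Schmidt: drop the hip-axis component
    y = y / np.linalg.norm(y)
    z = np.cross(x, y)                   # completes a right-handed frame
    return np.stack([x, y, z], axis=1), t


def to_human_centered(pose):
    """Express a camera-space pose in its own human-centered frame."""
    R, t = human_frame(pose)
    return (pose - t) @ R                # each row becomes R^T (p - t)


def transfer_global(source_pose, target_pseudo_pose):
    """Keep the source pose's local structure, but place it at the global
    position and orientation of a target-domain pseudo-3D pose."""
    local = to_human_centered(source_pose)
    R_t, t_t = human_frame(target_pseudo_pose)
    return local @ R_t.T + t_t           # each row becomes R_t p_local + t_t
```

In this sketch, `transfer_global(src, tgt)` returns a pose whose joint-to-joint geometry matches `src` and whose pelvis position and body orientation match `tgt`. That is the sense in which the human-centered frame separates local pose from global placement. How the actual Global Transformation Module selects or pairs target poses is not specified in the abstract.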