Enhancing Domain Generalization in 3D Human Pose Estimation through Controllable Generative Augmentation

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited domain generalization of 3D human pose estimators caused by distribution shifts between training and testing data. To bridge this gap under realistic deployment conditions, the authors propose a controllable generative augmentation strategy built on a multi-dimensional controllable human pose video synthesis framework. By systematically varying pose configurations, background scenes, and camera viewpoints, the method generates diverse cross-domain data spanning indoor/real and outdoor/virtual environments, which is then fused into an enriched training set. The study is presented as the first to introduce controllable video generation into domain generalization for 3D pose estimation, and it reports significant performance gains on multiple unseen datasets and scenarios, supporting the use of generative data augmentation for cross-domain generalization.

📝 Abstract

Pedestrian motion, due to its causal nature, is strongly influenced by domain gaps arising from discrepancies between training and testing data distributions. Focusing on 3D human pose estimation, this work presents a controllable human pose generation framework that synthesizes diverse video data by systematically varying poses, backgrounds, and camera viewpoints. This generative augmentation enriches training datasets, enhances model generalization, and alleviates the limitations of existing methods in handling domain discrepancies. By leveraging both indoor/real-world and outdoor/virtual datasets, we perform cross-domain data fusion and controllable video generation to construct enriched training data, tailored to realistic deployment settings. Extensive experiments show that the augmented datasets significantly improve model performance on unseen scenarios and datasets, validating the effectiveness of the proposed approach.
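As a rough illustration of the idea described above, the sketch below enumerates combinations of the three control axes (pose, background, camera viewpoint) and mixes the resulting synthetic items with real samples. It is not the authors' implementation: `GenerationSpec`, `build_specs`, `fuse`, the example axis values, and the fixed-count mixing rule are all assumptions made for illustration, and the actual video generator is left out entirely.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSpec:
    """One request to a hypothetical controllable video generator."""
    pose_id: str
    background: str
    camera_yaw_deg: int


def build_specs(pose_ids, backgrounds, camera_yaws):
    """Enumerate every combination of the three control axes."""
    return [
        GenerationSpec(p, b, y)
        for p, b, y in itertools.product(pose_ids, backgrounds, camera_yaws)
    ]


def fuse(real_samples, synthetic_samples, n_synthetic, seed=0):
    """Add up to n_synthetic randomly chosen synthetic items to the real set.

    The mixing rule is an assumption; the source does not specify one.
    """
    rng = random.Random(seed)
    k = min(n_synthetic, len(synthetic_samples))
    picked = rng.sample(synthetic_samples, k)
    fused = list(real_samples) + picked
    rng.shuffle(fused)
    return fused


# Hypothetical axis values: 2 poses x 2 backgrounds x 3 viewpoints.
specs = build_specs(["walk", "sit"], ["indoor_lab", "street"], [0, 90, 180])
fused = fuse(["real_0", "real_1"], specs, n_synthetic=4)
print(len(specs))  # 12
print(len(fused))  # 6
```

In a real pipeline each `GenerationSpec` would be rendered into a video clip with 3D pose labels before fusion; here the specs stand in for those clips so the sketch stays self-contained.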

Problem

Research questions and friction points this paper is trying to address.

domain generalization

3D human pose estimation

domain gap

data distribution discrepancy

Innovation

Methods, ideas, or system contributions that make the work stand out.

controllable generative augmentation

domain generalization

3D human pose estimation

cross-domain data fusion