GenHPE: Generative Counterfactuals for 3D Human Pose Estimation with Radio Frequency Signals

📅 2025-03-12

🤖 AI Summary

Existing RF-based 3D human pose estimation (HPE) methods are confined to single-domain settings and generalize poorly to unseen subjects or environments, because signal variations from different body parts are entangled with subject-specific confounders and the signals are mixed with environment-specific noise. GenHPE addresses this by training generative models conditioned on skeleton labels and then editing those labels (removing body parts) to synthesize counterfactual RF signals. The differences between counterfactual signals approximately cancel the subject- and environment-specific confounders and regularize an encoder-decoder model toward domain-independent representations. Evaluated on three public datasets covering WiFi, ultra-wideband (UWB), and millimeter-wave (mmWave) signals, it outperforms state-of-the-art methods, reducing pose error by up to 52.2 mm in the cross-subject setting and up to 10.6 mm in the cross-environment setting.

📝 Abstract

Human pose estimation (HPE) detects the positions of human body joints for various applications. Compared to using cameras, HPE using radio frequency (RF) signals is non-intrusive and more robust to adverse conditions, as it exploits the signal variations caused by human interference. However, existing studies focus on single-domain HPE confined by domain-specific confounders, so they cannot generalize to new domains and their HPE performance diminishes there. Specifically, the signal variations caused by different human body parts are entangled, containing subject-specific confounders. RF signals are also intertwined with environmental noise, involving environment-specific confounders. In this paper, we propose GenHPE, a 3D HPE approach that generates counterfactual RF signals to eliminate domain-specific confounders. GenHPE trains generative models conditioned on human skeleton labels, learning how human body parts and confounders interfere with RF signals. We manipulate skeleton labels (i.e., removing body parts) as counterfactual conditions for generative models to synthesize counterfactual RF signals. The differences between counterfactual signals approximately eliminate domain-specific confounders and regularize an encoder-decoder model to learn domain-independent representations. Such representations help GenHPE generalize to new subjects/environments for cross-domain 3D HPE. We evaluate GenHPE on three public datasets covering WiFi, ultra-wideband, and millimeter-wave signals. Experimental results show that GenHPE outperforms state-of-the-art methods and reduces estimation errors by up to 52.2 mm for cross-subject HPE and 10.6 mm for cross-environment HPE.
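
The idea that differences between counterfactual signals cancel domain-specific confounders can be illustrated with a toy sketch. This is not the paper's model: the linear `toy_generator`, the additive confounder, the joint count, and every name below are assumptions made purely for illustration, standing in for a trained generative model conditioned on skeleton labels.

```python
import numpy as np

# Hypothetical toy setup, NOT the GenHPE architecture.
N_JOINTS, SIGNAL_DIM = 6, 16
rng = np.random.default_rng(0)

# Fixed weights standing in for what a trained conditional generator has learned.
JOINT_RESPONSE = rng.normal(size=(N_JOINTS * 3, SIGNAL_DIM))


def toy_generator(skeleton, mask, confounder):
    """Stand-in for a generator conditioned on (possibly edited) skeleton labels.

    skeleton:   (N_JOINTS, 3) joint coordinates used as the condition.
    mask:       (N_JOINTS,) 1 keeps a joint, 0 removes it (counterfactual condition).
    confounder: (SIGNAL_DIM,) domain-specific offset, assumed additive here.
    """
    condition = (skeleton * mask[:, None]).reshape(-1)
    return condition @ JOINT_RESPONSE + confounder


def counterfactual_difference(skeleton, part, confounder):
    """Signal for the full skeleton minus the signal with `part` joints removed."""
    full = np.ones(N_JOINTS)
    removed = full.copy()
    removed[part] = 0.0
    return (toy_generator(skeleton, full, confounder)
            - toy_generator(skeleton, removed, confounder))


skeleton = rng.normal(size=(N_JOINTS, 3))
domain_a = rng.normal(size=SIGNAL_DIM)  # e.g. one subject or environment
domain_b = rng.normal(size=SIGNAL_DIM)  # a different subject or environment
arm = [0, 1]                            # hypothetical joint indices for one body part

diff_a = counterfactual_difference(skeleton, arm, domain_a)
diff_b = counterfactual_difference(skeleton, arm, domain_b)

# The raw signals differ across domains, but the counterfactual differences agree.
print(np.allclose(diff_a, diff_b))
```

Under the additive assumption the confounder cancels exactly; with a learned, non-linear generator the cancellation would only be approximate, which matches the abstract's wording that the differences "approximately eliminate" the confounders.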

Problem

Research questions and friction points this paper is trying to address.

Eliminates domain-specific confounders in RF-based 3D human pose estimation.

Generates counterfactual RF signals to improve cross-domain generalization.

Reduces estimation errors for cross-subject and cross-environment HPE.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates counterfactual RF signals for 3D HPE.

Uses generative models conditioned on skeleton labels.

Eliminates domain-specific confounders for cross-domain HPE.
