UDAPose: Unsupervised Domain Adaptation for Low-Light Human Pose Estimation

📅 2026-04-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the sharp drop in human pose estimation accuracy under low-light conditions, which stems from scarce annotated low-light data and degraded visual information. The authors propose UDAPose, an unsupervised domain adaptation framework with two parts. Its synthesis stage uses a Direct-Current-based High-Pass Filter (DHF) and a Low-light Characteristics Injection Module (LCIM) to inject high-frequency details from real low-light images into the synthesized ones. Its Transformer-based pose estimator uses a Dynamic Control of Attention (DCA) module to adaptively balance image cues against learned pose priors. The method achieves state-of-the-art results, improving AP by 10.1 points to 56.4% on the ExLPose-test hard set (LL-H) and by 7.4 points to 31.4% in cross-dataset validation on EHPT-XC.
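To make the idea of a DC-based high-pass filter concrete, here is a minimal NumPy sketch. It is an assumption about the general technique, not the paper's DHF: it zeroes only the zero-frequency (DC) coefficient of each channel's 2-D spectrum, which is equivalent to subtracting the per-channel mean and leaves every other frequency untouched. The function name `remove_dc_highpass` and the random test image are hypothetical.

```python
import numpy as np


def remove_dc_highpass(image: np.ndarray) -> np.ndarray:
    """Zero the DC coefficient of each channel; keep all other frequencies.

    Illustrative only: the paper's DHF may be defined differently.
    `image` has shape (H, W) or (H, W, C).
    """
    img = image.astype(np.float64)
    spectrum = np.fft.fft2(img, axes=(0, 1))
    # Without fftshift, the DC term sits at index (0, 0) of the spatial axes.
    spectrum[0, 0, ...] = 0.0
    return np.real(np.fft.ifft2(spectrum, axes=(0, 1)))


# Hypothetical dark image with values in [0, 0.1].
rng = np.random.default_rng(0)
low_light = rng.uniform(0.0, 0.1, size=(8, 8, 3))
residual = remove_dc_highpass(low_light)
print(residual.mean(axis=(0, 1)))  # per-channel means, numerically close to zero
```

The residual carries the noise and texture of the input while discarding its overall brightness level, which is the kind of signal a synthesis pipeline could transfer onto a darkened well-lit image.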

📝 Abstract

Low-visibility scenarios, such as low-light conditions, pose significant challenges to human pose estimation due to the scarcity of annotated low-light datasets and the loss of visual information under poor illumination. Recent domain adaptation techniques attempt to exploit well-lit labels by augmenting well-lit images to mimic low-light conditions. However, handcrafted augmentations oversimplify noise patterns, while learning-based methods often fail to preserve high-frequency low-light characteristics, producing unrealistic images that cause pose models to generalize poorly to real low-light scenes. Moreover, recent pose estimators rely on image cues through image-to-keypoint cross-attention, but these cues become unreliable under low-light conditions. To address these issues, we propose Unsupervised Domain Adaptation for Pose Estimation (UDAPose), a novel framework that synthesizes low-light images and dynamically fuses visual cues with pose priors for improved pose estimation. Specifically, our synthesis method incorporates a Direct-Current-based High-Pass Filter (DHF) and a Low-light Characteristics Injection Module (LCIM) to inject high-frequency details from input low-light images, overcoming the rigidity of handcrafted augmentations and the detail loss of learning-based methods. Furthermore, we introduce a Dynamic Control of Attention (DCA) module that adaptively balances image cues with learned pose priors in the Transformer architecture. Experiments show that UDAPose outperforms state-of-the-art methods, with notable AP gains of 10.1 (56.4%) on the ExLPose-test hard set (LL-H) and 7.4 (31.4%) in cross-dataset validation on EHPT-XC. Code: https://github.com/Vision-and-Multimodal-Intelligence-Lab/UDAPose
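The balance between image cues and pose priors can be pictured as a gated blend around image-to-keypoint cross-attention. The sketch below is a hedged illustration, not the DCA module from the paper or its repository: the function `gated_cross_attention`, the per-keypoint `gate` input, and all tensor sizes are assumptions made for this example. In a trained model the gate would presumably be predicted from the features rather than supplied by hand.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def gated_cross_attention(queries, image_tokens, pose_prior, gate):
    """Blend an image-derived cue with a pose prior, per keypoint.

    queries:      (K, D) keypoint queries
    image_tokens: (N, D) image features, used as both keys and values
    pose_prior:   (K, D) learned prior features (hypothetical input)
    gate:         (K, 1) values in [0, 1]; 1 trusts the image, 0 trusts the prior
    """
    dim = queries.shape[-1]
    scores = queries @ image_tokens.T / np.sqrt(dim)  # (K, N)
    weights = softmax(scores, axis=-1)                # each row sums to 1
    image_cue = weights @ image_tokens                # (K, D)
    return gate * image_cue + (1.0 - gate) * pose_prior


# Hypothetical sizes: 4 keypoints, 16 image tokens, 8 feature channels.
rng = np.random.default_rng(0)
K, N, D = 4, 16, 8
queries = rng.normal(size=(K, D))
image_tokens = rng.normal(size=(N, D))
pose_prior = rng.normal(size=(K, D))

trust_image = gated_cross_attention(queries, image_tokens, pose_prior, np.ones((K, 1)))
trust_prior = gated_cross_attention(queries, image_tokens, pose_prior, np.zeros((K, 1)))
mixed = gated_cross_attention(queries, image_tokens, pose_prior, np.full((K, 1), 0.3))
```

With the gate at 0 the output reduces to the pose prior, which is the behavior one would want when the image is too dark to provide reliable cues. Intermediate values interpolate linearly between the two sources.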

Problem

Research questions and friction points this paper is trying to address.

low-light

human pose estimation

unsupervised domain adaptation

visual information loss

image-to-keypoint attention

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised Domain Adaptation

Low-Light Pose Estimation

High-Frequency Detail Preservation