Leveraging Synthetic Adult Datasets for Unsupervised Infant Pose Estimation

📅 2025-04-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the scarcity of labeled data and the poor generalization under distribution shift in infant pose estimation, this paper introduces SHIFT, an unsupervised domain adaptation framework that uses readily available synthetic adult pose data to supervise infant pose learning. SHIFT combines three components: (1) a pseudo-labeling-based Mean-Teacher framework that enforces consistency between the student predictions and the teacher pseudo-labels; (2) an infant manifold pose prior that penalizes implausible predictions; and (3) a visibility consistency module that improves self-occlusion perception and aligns predicted poses with the original image. Across multiple benchmarks, SHIFT outperforms existing state-of-the-art unsupervised domain adaptation pose estimation methods by 5% and supervised infant pose estimation methods by 16%.

📝 Abstract

Human pose estimation is a critical tool across a variety of healthcare applications. Despite significant progress in pose estimation algorithms targeting adults, such developments for infants remain limited. Existing algorithms for infant pose estimation, despite achieving commendable performance, depend on fully supervised approaches that require large amounts of labeled data. These algorithms also struggle with poor generalizability under distribution shifts. To address these challenges, we introduce SHIFT: Leveraging SyntHetic Adult Datasets for Unsupervised InFanT Pose Estimation, which leverages the pseudo-labeling-based Mean-Teacher framework to compensate for the lack of labeled data and addresses distribution shifts by enforcing consistency between the student and the teacher pseudo-labels. Additionally, to penalize implausible predictions obtained from the mean-teacher framework, we incorporate an infant manifold pose prior. To enhance SHIFT's self-occlusion perception ability, we propose a novel visibility consistency module for improved alignment of the predicted poses with the original image. Extensive experiments on multiple benchmarks show that SHIFT significantly outperforms existing state-of-the-art unsupervised domain adaptation (UDA) pose estimation methods by 5% and supervised infant pose estimation methods by a margin of 16%. The project page is available at: https://sarosijbose.github.io/SHIFT.
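The Mean-Teacher idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the SHIFT implementation: it only shows the generic pattern in which a teacher model tracks an exponential moving average (EMA) of the student weights and the student is trained to agree with the teacher's pseudo-labels. The function names, the `decay` value, and the confidence `threshold` used to filter pseudo-labels are assumptions made for illustration.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]   # hypothetical flat weight store: name -> values
Keypoints = List[List[float]]      # one [x, y] pair per joint


def ema_update(teacher: Weights, student: Weights, decay: float = 0.999) -> Weights:
    """Return new teacher weights: an exponential moving average of the student."""
    return {
        name: [decay * t + (1.0 - decay) * s
               for t, s in zip(teacher[name], student[name])]
        for name in teacher
    }


def consistency_loss(student_kpts: Keypoints, teacher_kpts: Keypoints,
                     teacher_conf: List[float], threshold: float = 0.5) -> float:
    """Mean squared distance between student predictions and teacher pseudo-labels.

    Only joints whose teacher confidence reaches `threshold` are counted
    (the filtering rule is an assumption, not taken from the paper).
    """
    total, kept = 0.0, 0
    for s, t, c in zip(student_kpts, teacher_kpts, teacher_conf):
        if c >= threshold:
            total += (s[0] - t[0]) ** 2 + (s[1] - t[1]) ** 2
            kept += 1
    return total / kept if kept else 0.0


if __name__ == "__main__":
    teacher = {"w": [1.0, 0.0]}
    student = {"w": [0.0, 1.0]}
    print(ema_update(teacher, student, decay=0.9))
    print(consistency_loss([[0.0, 0.0], [1.0, 1.0]],
                           [[1.0, 0.0], [5.0, 5.0]],
                           [0.9, 0.1]))
```

In this pattern only the student receives gradient updates; the teacher changes slowly through the EMA, which tends to make its pseudo-labels more stable than the student's raw predictions.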

Problem

Research questions and friction points this paper is trying to address.

Infant pose estimation lacks the large labeled datasets that fully supervised methods require

Existing methods struggle with distribution shifts

Improving accuracy and generalizability in infant pose prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a pseudo-labeling-based Mean-Teacher framework

Incorporates infant manifold pose prior

Proposes visibility consistency module
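One way the three components listed above might be combined into a single training objective is sketched below. This page does not describe the actual form of the infant manifold pose prior or of the visibility consistency module, so both are replaced by simple stand-ins: the prior is approximated by the distance to the nearest pose in a hypothetical bank of plausible poses, and the visibility term by the mean absolute gap between two per-joint visibility estimates. The weights `w_prior` and `w_vis` are likewise assumptions.

```python
import math
from typing import List

Pose = List[float]  # flattened joint coordinates


def prior_penalty(pose: Pose, plausible_poses: List[Pose]) -> float:
    """Stand-in prior: Euclidean distance to the closest pose in a reference bank."""
    return min(math.dist(pose, ref) for ref in plausible_poses)


def visibility_penalty(pred_vis: List[float], ref_vis: List[float]) -> float:
    """Stand-in visibility term: mean absolute gap between two
    per-joint visibility estimates, each in [0, 1]."""
    return sum(abs(p - r) for p, r in zip(pred_vis, ref_vis)) / len(pred_vis)


def total_loss(consistency: float, prior: float, visibility: float,
               w_prior: float = 0.1, w_vis: float = 0.1) -> float:
    """Weighted sum of the three terms; the weights are hypothetical."""
    return consistency + w_prior * prior + w_vis * visibility


if __name__ == "__main__":
    p = prior_penalty([0.0, 0.0], [[3.0, 4.0], [1.0, 0.0]])
    v = visibility_penalty([1.0, 0.0], [1.0, 1.0])
    print(p, v, total_loss(1.0, p, v))
```

A weighted sum keeps the pseudo-label consistency term as the main training signal while the other two terms act as regularizers; the relative weights would need to be tuned for any real setup.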

🔎 Similar Papers

Automatic infant 2D pose estimation from videos: comparing seven deep neural network methods