Generative Data Augmentation for Skeleton Action Recognition

📅 2026-04-16
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the scarcity of high-quality labeled data in 3D skeletal action recognition by proposing a conditional generative data augmentation approach constrained by action labels. Built on a Transformer-based encoder-decoder architecture, the method adds a generative refinement module and a dropout mechanism to balance fidelity and diversity during sequence sampling. The resulting synthetic skeletal sequences are both realistic and varied, and they consistently improve the accuracy of multiple action recognition models in both few-shot and full-data settings. Experiments on the HumanAct12 and NTU-VIBE (refined NTU-RGBD) datasets support the effectiveness and generalizability of the proposed augmentation strategy.

📝 Abstract
Skeleton-based human action recognition is a powerful approach for understanding human behaviour from pose data, but collecting large-scale, diverse, and well-annotated 3D skeleton datasets is both expensive and labor-intensive. To address this challenge, we propose a conditional generative pipeline for data augmentation in skeleton action recognition. Our method learns the distribution of real skeleton sequences under the constraint of action labels, enabling the synthesis of diverse and high-fidelity data. Even with limited training samples, it can effectively generate skeleton sequences and achieve competitive recognition performance in low-data scenarios, demonstrating strong generalisation in downstream tasks. Specifically, we introduce a Transformer-based encoder-decoder architecture, combined with a generative refinement module and a dropout mechanism, to balance fidelity and diversity during sampling. Experiments on HumanAct12 and the refined NTU-RGBD (NTU-VIBE) dataset show that our approach consistently improves the accuracy of multiple skeleton-based action recognition models, validating its effectiveness in both few-shot and full-data settings. The source code can be found here.
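
To make the augmentation workflow concrete, the following is a minimal sketch of how label-conditioned synthetic sequences might be mixed into a small training set. It is not the paper's model: the Transformer encoder-decoder is replaced by a toy stand-in (a per-class mean sequence plus Gaussian noise), and the names `fit_class_means`, `sample_sequence`, `augment`, `noise_scale`, and `drop_p` are all hypothetical. The random mask on the noise only illustrates the idea of a single knob trading fidelity against diversity; how the paper's dropout mechanism actually works is not described in this abstract.

```python
import random
from collections import defaultdict


def fit_class_means(dataset):
    """Toy stand-in for training a generator: average the sequences of each label.

    dataset: list of (sequence, label) pairs, where a sequence is a list of
    frames and a frame is a flat list of joint coordinates (floats).
    Assumption: all sequences of one label share the same length and frame size.
    """
    sums, counts = {}, defaultdict(int)
    for seq, label in dataset:
        if label not in sums:
            sums[label] = [[0.0] * len(frame) for frame in seq]
        for t, frame in enumerate(seq):
            for j, value in enumerate(frame):
                sums[label][t][j] += value
        counts[label] += 1
    return {
        label: [[v / counts[label] for v in frame] for frame in total]
        for label, total in sums.items()
    }


def sample_sequence(means, label, rng, noise_scale=0.05, drop_p=0.5):
    """Draw one synthetic sequence conditioned on an action label.

    Each noise term is dropped (set to zero) with probability drop_p, so
    drop_p = 1.0 returns the class mean exactly (maximum fidelity in this toy)
    and drop_p = 0.0 perturbs every coordinate (maximum diversity).
    """
    out = []
    for frame in means[label]:
        new_frame = []
        for value in frame:
            dropped = rng.random() < drop_p
            noise = 0.0 if dropped else rng.gauss(0.0, noise_scale)
            new_frame.append(value + noise)
        out.append(new_frame)
    return out


def augment(dataset, n_per_class, seed=0, **sampler_kwargs):
    """Return the real data followed by n_per_class synthetic samples per label."""
    rng = random.Random(seed)
    means = fit_class_means(dataset)
    synthetic = [
        (sample_sequence(means, label, rng, **sampler_kwargs), label)
        for label in sorted(means)
        for _ in range(n_per_class)
    ]
    return dataset + synthetic
```

In a few-shot setting, the list returned by `augment(real_data, n_per_class=...)` would be passed to whichever recognition model is being trained; the recognizer itself is unchanged, which is why the same augmentation can serve multiple models.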

Problem

Research questions and friction points this paper is trying to address.

skeleton action recognition
data augmentation
3D skeleton data
limited training data
human behaviour understanding

Innovation

Methods, ideas, or system contributions that make the work stand out.

conditional generative model
skeleton action recognition
data augmentation
Transformer-based architecture
few-shot learning