Few-shot Human Action Anomaly Detection via a Unified Contrastive Learning Framework

📅 2025-08-25

🤖 AI Summary

Existing action anomaly detection methods train one model per action category, which scales poorly and requires large volumes of normal training samples. To address these limitations, the paper proposes a unified few-shot contrastive learning framework that builds a category-agnostic representation space, enabling anomaly detection across categories using only a small support set of normal action sequences. A key component is a generative motion augmentation strategy built on a diffusion-based foundation model, which synthesizes diverse, realistic normal action sequences to improve intra-category robustness and inter-category generalization. On the HumanAct12 benchmark, the method reports state-of-the-art results under both seen and unseen category settings, with advantages in training efficiency and model scalability. This supports rapid adaptation to novel action categories and deployment in data-scarce scenarios.

📝 Abstract

Human Action Anomaly Detection (HAAD) aims to identify anomalous actions given only normal action data during training. Existing methods typically follow a one-model-per-category paradigm, requiring separate training for each action category and a large number of normal samples. These constraints hinder scalability and limit applicability in real-world scenarios, where data is often scarce and novel categories frequently appear. To address these limitations, we propose a unified framework for HAAD that is compatible with few-shot scenarios. Our method constructs a category-agnostic representation space via contrastive learning, enabling anomaly detection (AD) by comparing test samples with a small given set of normal examples (referred to as the support set). To improve inter-category generalization and intra-category robustness, we introduce a generative motion augmentation strategy that uses a diffusion-based foundation model to create diverse and realistic training samples. To the best of our knowledge, our work is the first to introduce such a strategy specifically tailored to enhance contrastive learning for action AD. Extensive experiments on the HumanAct12 dataset demonstrate the state-of-the-art effectiveness of our approach under both seen and unseen category settings, as well as its training efficiency and model scalability for few-shot HAAD.
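The support-set comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact scoring rule, so the nearest-neighbor cosine distance used here is an assumption, and the names `cosine_similarity`, `anomaly_score`, and the toy two-dimensional embeddings are hypothetical stand-ins for the output of a learned encoder.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def anomaly_score(query, support):
    """Assumed scoring rule: 1 minus the highest cosine similarity
    between the query embedding and any support-set embedding.
    A higher score means the query is farther from every normal example."""
    return 1.0 - max(cosine_similarity(query, s) for s in support)

# Toy embeddings (hypothetical); in practice these would come from
# the contrastively trained encoder.
support_set = [[1.0, 0.0], [0.9, 0.1]]   # a few normal examples
normal_query = [1.0, 0.05]               # close to the support set
odd_query = [0.0, 1.0]                   # far from the support set

print(anomaly_score(normal_query, support_set))
print(anomaly_score(odd_query, support_set))
```

Under this assumed rule, adapting to a new action category only requires swapping in a new support set; the encoder itself is not retrained.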

Problem

Research questions and friction points this paper is trying to address.

Detects human action anomalies with few normal samples

Eliminates the need to train a separate model for each action category

Addresses scalability and data scarcity in real applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified contrastive learning for few-shot anomaly detection

Generative motion augmentation using diffusion foundation model

Category-agnostic representation space construction via contrastive learning
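As a rough illustration of how contrastive learning can shape such a representation space, the sketch below computes an InfoNCE-style loss for one anchor. The summary does not say which contrastive objective the paper uses, so InfoNCE, the temperature value, and the function name `info_nce_loss` are assumptions. The positive could be, for example, a generated variant of the same normal action, and the negatives could be sequences from other categories.

```python
import math

def _cos(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for a single anchor (assumed objective).
    The loss is low when the anchor is closer to its positive than
    to every negative, and high otherwise."""
    pos = math.exp(_cos(anchor, positive) / temperature)
    neg = sum(math.exp(_cos(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy embeddings (hypothetical).
anchor = [1.0, 0.0]
aligned = info_nce_loss(anchor, [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce_loss(anchor, [0.0, 1.0], [[1.0, 0.0]])
print(aligned, misaligned)
```

Minimizing a loss of this kind pulls an anchor toward its positive and pushes it away from the negatives, which is the general mechanism behind a shared embedding space that can be queried with a small support set.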
