AI Summary
Video anomaly detection faces challenges from diverse anomaly types and a severe scarcity of labeled anomalies. This paper proposes a self-context-aware, one-class, few-shot Transformer framework that trains a video-specific model using only the initial normal frames of each video. Leveraging self-supervised temporal attention, the model predicts subsequent frame features and localizes anomalies at the frame level via the residual between predicted and ground-truth features. Crucially, it requires no anomalous samples, enabling both video-specific modeling and dynamic contextual adaptation. The core innovation lies in deeply integrating self-attention with one-class few-shot temporal forecasting to establish an end-to-end prediction-residual detection paradigm. Extensive experiments demonstrate significant improvements over state-of-the-art methods across multiple standard benchmarks, and ablation studies confirm that the self-context mechanism critically enhances both detection accuracy and cross-scenario generalization.
Abstract
Anomaly detection in videos is a challenging task because anomalies in different videos are of different kinds. Therefore, a promising way to approach video anomaly detection is by learning the non-anomalous nature of the video at hand. To this end, we propose a one-class, few-shot-learning-driven, transformer-based approach for anomaly detection in videos that is self-context aware. Features from the first few consecutive non-anomalous frames in a video are used to train the transformer to predict the non-anomalous feature of the subsequent frame. This takes place under the attention of a self-context learned from the input features themselves. After this learning, given a few previous frames, the video-specific transformer infers whether a frame is anomalous by comparing the feature it predicts with the actual feature. The effectiveness of the proposed method with respect to the state-of-the-art is demonstrated through qualitative and quantitative results on different standard datasets. We also study the positive effect of the self-context used in our approach.
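The inference step described above, predicting the next frame's feature from a few preceding frames and flagging large prediction residuals, can be sketched as follows. This is a minimal illustration, not the paper's trained model: the self-attention here uses identity projections in place of learned weights, and all function names are our own.

```python
import numpy as np

def self_attention(context):
    """Single-head self-attention over a (T, d) sequence of frame features.

    Identity query/key/value projections stand in for the learned
    transformer weights of the actual method (illustrative only).
    """
    scores = context @ context.T / np.sqrt(context.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ context

def predict_next_feature(context):
    # Predict the next frame's feature as the attended summary at the
    # last time step of the non-anomalous context window.
    return self_attention(context)[-1]

def anomaly_score(context, actual_feature):
    # Residual between the predicted and the actual frame feature;
    # a large residual suggests the frame is anomalous.
    return np.linalg.norm(predict_next_feature(context) - actual_feature)

# Toy example: a context of near-constant "normal" features.
rng = np.random.default_rng(0)
context = 1.0 + 0.01 * rng.normal(size=(8, 16))
normal_frame = context.mean(axis=0)
anomalous_frame = normal_frame + 5.0  # a feature far from the normal pattern

print(anomaly_score(context, normal_frame) < anomaly_score(context, anomalous_frame))
```

A frame-level detector then thresholds this score, sliding the context window forward so each prediction is conditioned on the most recent frames.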