🤖 AI Summary
This work addresses the challenge of deploying video anomaly detection in human-centric scenarios, where sensitive information such as facial identity often impedes real-world application. To mitigate privacy risks without compromising detection performance, the authors propose a privacy-preserving approach based on orthogonal subspace projection. Their method employs a lightweight Orthogonal Projection Layer (OPL) and a guided variant (G-OPL) to explicitly remove task-irrelevant features, such as facial appearance, while preserving discriminative cues like pose and motion dynamics. Notably, this is achieved without requiring identity labels or adversarial training. By combining weak supervision from face-presence signals with a cosine alignment objective, the framework jointly optimizes anomaly detection accuracy and privacy preservation. Experiments demonstrate that the proposed method significantly reduces the risk of sensitive information leakage across multiple benchmarks while maintaining or even improving anomaly detection performance.
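The idea of orthogonal subspace projection mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' OPL implementation: the function name `orthogonal_projector`, the random features, and the two "nuisance" directions are all hypothetical, and the directions are assumed to be linearly independent. The sketch only shows the generic operation of removing a known subspace from feature vectors.

```python
import numpy as np

def orthogonal_projector(directions):
    """Return P = I - Q Q^T, which removes the span of `directions` (d x k).

    Assumes the k columns of `directions` are linearly independent.
    """
    # Reduced QR gives an orthonormal basis Q (d x k) for the column span.
    q, _ = np.linalg.qr(directions)
    d = directions.shape[0]
    return np.eye(d) - q @ q.T

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))   # 5 hypothetical feature vectors, d = 8
nuisance = rng.normal(size=(8, 2))   # 2 hypothetical directions to suppress

P = orthogonal_projector(nuisance)
filtered = features @ P              # P is symmetric, so row-wise projection works

# Largest remaining component along the suppressed directions (close to zero).
residual = np.abs(filtered @ nuisance).max()
```

After projection, `filtered` has no component along the suppressed directions, while everything orthogonal to them is left unchanged. In a learned layer the directions would be trained rather than sampled at random.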
📝 Abstract
Video anomaly detection (VAD) systems often prioritize accuracy while overlooking privacy concerns, limiting their suitability for real-world deployment. We propose the Orthogonal Projection Layer (OPL), a lightweight module that removes task-irrelevant variations to produce representations focused on anomaly-relevant cues. To address privacy risks in human-centered scenarios, we introduce Guided OPL (G-OPL), which suppresses facial attributes using weak supervision from face-presence signals while preserving non-identifying features such as pose and motion. A cosine alignment objective enforces consistent capture and removal of facial information without identity labels or adversarial training. We further present a privacy-aware evaluation framework that jointly assesses detection performance and privacy preservation, and enables analysis of how sensitive information is filtered. Experiments show that embedding privacy constraints into model design reduces sensitive information while maintaining or improving detection accuracy, supporting projection-based architectures as a principled approach for privacy-aware VAD.
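The abstract does not state which vectors the cosine alignment objective compares, so the following is only a generic sketch of a cosine alignment term of the form 1 - cos(u, v). The pairing used here (a removed feature component against a face-related reference direction) and the names `cosine_alignment_loss`, `removed`, and `face_dir` are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def cosine_alignment_loss(u, v, eps=1e-8):
    """Mean of 1 - cos(u_i, v_i) over rows; 0 when aligned, 2 when opposed."""
    num = np.sum(u * v, axis=-1)
    den = np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1) + eps
    return float(np.mean(1.0 - num / den))

# Hypothetical removed components and reference directions (assumed inputs).
removed = np.array([[1.0, 0.0], [0.0, 2.0]])
face_dir = np.array([[2.0, 0.0], [0.0, 1.0]])

aligned_loss = cosine_alignment_loss(removed, face_dir)    # same directions
opposed_loss = cosine_alignment_loss(removed, -face_dir)   # opposite directions
```

Because the term depends only on direction and not on magnitude, it can encourage two representations to point the same way without constraining their scale.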