StegaVAR: Privacy-Preserving Video Action Recognition via Steganographic Domain Analysis

📅 2025-12-14
🤖 AI Summary
Existing privacy-preserving video action recognition (VAR) methods suffer from two critical limitations: perceptible visual distortion (low concealment) and degradation of spatiotemporal features (which harms recognition accuracy). This paper proposes, for the first time, an end-to-end steganographic-domain VAR paradigm: action videos are invisibly embedded into natural cover videos, preserving cover visual fidelity while fully retaining action-relevant spatiotemporal information. It introduces the Secret Spatio-Temporal Promotion (STeP) mechanism to guide secret feature extraction and Cross-Band Difference Attention (CroDA) to suppress cover-induced interference. The framework jointly integrates steganographic encoding, spatiotemporal disentanglement, and a trainable steganalytic network. Evaluated on mainstream benchmarks, the method achieves significantly higher VAR accuracy than conventional anonymization approaches, attains >99.2% resistance against steganalysis detection, and maintains compatibility with diverse steganographic models.

📝 Abstract
Despite the rapid progress of deep learning in video action recognition (VAR) in recent years, privacy leakage in videos remains a critical concern. Current state-of-the-art privacy-preserving methods often rely on anonymization. These methods suffer from (1) low concealment, producing visually distorted videos that attract attackers' attention during transmission, and (2) spatiotemporal disruption, degrading the spatiotemporal features that are essential for accurate VAR. To address these issues, we propose StegaVAR, a novel framework that embeds action videos into ordinary cover videos and, for the first time, performs VAR directly in the steganographic domain. Throughout both data transmission and action analysis, the spatiotemporal information of the hidden secret video remains complete, while the natural appearance of the cover videos keeps the transmission concealed. Because analysis in the steganographic domain is difficult, we propose Secret Spatio-Temporal Promotion (STeP) and Cross-Band Difference Attention (CroDA). STeP uses the secret video to guide spatiotemporal feature extraction in the steganographic domain during training. CroDA suppresses cover interference by capturing cross-band semantic differences. Experiments demonstrate that StegaVAR achieves superior VAR and privacy-preserving performance on widely used datasets. Moreover, our framework is effective with multiple steganographic models.

Problem

Research questions and friction points this paper is trying to address.

Privacy leakage in videos during action recognition is a critical concern.
Current methods suffer from low concealment and spatiotemporal disruption.
The paper proposes a framework to embed and analyze videos in the steganographic domain.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Embeds action videos into ordinary cover videos
Performs recognition directly in steganographic domain
Uses STeP and CroDA for domain-specific feature extraction
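
To make the idea of hiding one video inside another concrete, here is a minimal sketch that swaps in classical least-significant-bit (LSB) steganography. This is an assumption made purely for illustration: the summary above does not describe how StegaVAR embeds videos, and the framework is said to work with learned steganographic models, not LSB. The names `embed`, `recover`, `max_abs_diff`, and the constant `BITS` are hypothetical. The sketch only shows the two properties the list relies on: the stego video stays close to the cover, and the secret content remains recoverable from it.

```python
# Toy LSB steganography on 8-bit pixels.
# ASSUMPTION: illustrative stand-in only, not StegaVAR's learned embedding.
# A "video" is a list of frames; each frame is a flat list of ints in 0..255.

BITS = 4  # hypothetical choice: low-order cover bits overwritten by the secret


def embed(cover, secret, bits=BITS):
    """Store the top `bits` bits of each secret pixel in the low bits of the cover pixel."""
    keep_mask = (0xFF << bits) & 0xFF  # high cover bits that stay untouched
    return [
        [(c & keep_mask) | (s >> (8 - bits)) for c, s in zip(cover_frame, secret_frame)]
        for cover_frame, secret_frame in zip(cover, secret)
    ]


def recover(stego, bits=BITS):
    """Read the hidden bits back and shift them to the top of an 8-bit value."""
    low_mask = (1 << bits) - 1
    return [[(p & low_mask) << (8 - bits) for p in frame] for frame in stego]


def max_abs_diff(a, b):
    """Largest per-pixel absolute difference between two videos of equal shape."""
    return max(abs(x - y) for fa, fb in zip(a, b) for x, y in zip(fa, fb))


# Two tiny 2-frame, 4-pixel "videos" with made-up values.
cover = [[200, 13, 77, 255], [0, 128, 64, 31]]
secret = [[16, 240, 129, 3], [255, 90, 45, 180]]

stego = embed(cover, secret)
restored = recover(stego)

# The stego video differs from the cover by at most 2**BITS - 1 per pixel,
# and the restored secret differs from the original by at most 2**(8 - BITS) - 1.
print(max_abs_diff(cover, stego), max_abs_diff(secret, restored))
```

In this toy setting, `BITS` trades concealment against fidelity: fewer overwritten bits keep the stego video closer to the cover but preserve less of the secret. A recognizer operating in the steganographic domain, as the paper proposes, would take `stego` as input directly instead of first calling `recover`.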
Lixin Chen
School of Computing and Information Technology, Great Bay University
Chaomeng Chen
Tsinghua Shenzhen International Graduate School, Tsinghua University
Jiale Zhou
MDH
requirements engineering, safety critical systems, hazard analysis, ontology
Zhijian Wu
School of Engineering, Westlake University
Xun Lin
Postdoc, CUHK; PhD, Beihang University
Subtle Visual Computing, Media Security