StegaVAR: Privacy-Preserving Video Action Recognition via Steganographic Domain Analysis

📅 2025-12-14

🤖 AI Summary

Existing privacy-preserving video action recognition (VAR) methods suffer from two critical limitations: perceptible visual distortion (low concealment) and degradation of spatiotemporal features (harming recognition accuracy). This paper proposes, for the first time, an end-to-end steganographic-domain VAR paradigm: action videos are invisibly embedded into natural cover videos, preserving cover visual fidelity while fully retaining action-relevant spatiotemporal information. It introduces the Secret Spatio-Temporal Promotion (STeP) mechanism to guide secret feature extraction and Cross-Band Difference Attention (CroDA) to suppress cover-induced interference. The framework jointly integrates steganographic encoding, spatiotemporal disentanglement, and a trainable steganalytic network. Evaluated on mainstream benchmarks, the method achieves significantly higher VAR accuracy than conventional anonymization approaches, attains >99.2% resistance against steganalysis detection, and maintains compatibility with diverse steganographic models.

📝 Abstract

Despite the rapid progress of deep learning in video action recognition (VAR) in recent years, privacy leakage in videos remains a critical concern. Current state-of-the-art privacy-preserving methods often rely on anonymization. These methods suffer from (1) low concealment, because they produce visually distorted videos that attract attackers' attention during transmission, and (2) spatiotemporal disruption, because they degrade the spatiotemporal features essential for accurate VAR. To address these issues, we propose StegaVAR, a novel framework that embeds action videos into ordinary cover videos and, for the first time, performs VAR directly in the steganographic domain. Throughout both data transmission and action analysis, the spatiotemporal information of the hidden secret video remains complete, while the natural appearance of the cover videos keeps the transmission concealed. Because analysis in the steganographic domain is difficult, we propose Secret Spatio-Temporal Promotion (STeP) and Cross-Band Difference Attention (CroDA). STeP uses the secret video to guide spatiotemporal feature extraction in the steganographic domain during training. CroDA suppresses cover interference by capturing cross-band semantic differences. Experiments demonstrate that StegaVAR achieves superior VAR and privacy-preserving performance on widely used datasets. Moreover, our framework is effective for multiple steganographic models.
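
To make the idea of hiding a secret video inside a cover video concrete, here is a minimal toy sketch. It uses classic least-significant-bit (LSB) hiding, which is a stand-in chosen for illustration and is not the learned steganographic model the paper relies on. The function names, the 4-bit split, and the tiny example videos are all assumptions made for this sketch.

```python
# Toy illustration only: classic 4-bit LSB hiding, NOT the learned
# steganographic network used by StegaVAR. All names here are hypothetical.

BITS = 4  # number of low-order cover bits overwritten (an assumption)


def embed_pixel(cover: int, secret: int, bits: int = BITS) -> int:
    """Store the top `bits` bits of `secret` in the low `bits` bits of `cover`."""
    keep_mask = (0xFF << bits) & 0xFF  # selects the cover's high bits
    return (cover & keep_mask) | (secret >> (8 - bits))


def extract_pixel(stego: int, bits: int = BITS) -> int:
    """Recover a coarse approximation of the secret pixel from a stego pixel."""
    return (stego & ((1 << bits) - 1)) << (8 - bits)


def embed_video(cover, secret, bits: int = BITS):
    """Embed videos shaped [frame][row][col] of 8-bit integers, pixel by pixel."""
    return [
        [[embed_pixel(c, s, bits) for c, s in zip(c_row, s_row)]
         for c_row, s_row in zip(c_frame, s_frame)]
        for c_frame, s_frame in zip(cover, secret)
    ]


def extract_video(stego, bits: int = BITS):
    """Invert embed_video, up to the precision lost by dropping low secret bits."""
    return [[[extract_pixel(p, bits) for p in row] for row in frame]
            for frame in stego]


# Two frames of 2x2 grayscale pixels each (made-up values).
cover = [[[200, 13], [97, 255]], [[0, 64], [128, 31]]]
secret = [[[255, 0], [18, 130]], [[77, 240], [5, 199]]]

stego = embed_video(cover, secret)
recovered = extract_video(stego)
```

With a 4-bit split, every stego pixel differs from its cover pixel by at most 15 gray levels, and every recovered pixel differs from the secret pixel by at most 15. Learned steganographic models aim for a far better trade-off, but the shape of the data flow (cover plus secret in, stego out, secret recoverable) is the same.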

Problem

Research questions and friction points this paper is trying to address.

Privacy leakage in videos during action recognition is a critical concern.

Current methods suffer from low concealment and spatiotemporal disruption.

The paper proposes a framework to embed and analyze videos in the steganographic domain.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Embeds action videos into ordinary cover videos

Performs recognition directly in the steganographic domain

Uses STeP and CroDA for domain-specific feature extraction
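
The abstract says only that STeP uses the secret video to guide feature extraction during training; it does not give the loss. One common way to realize that kind of guidance is a feature-alignment term added to the classification loss, sketched below. This is an assumption for illustration, not the paper's formulation: the mean-squared-error term, the `weight` parameter, and the function names are all hypothetical.

```python
# Hypothetical sketch of training-time guidance via feature alignment.
# This is NOT the STeP loss from the paper, whose exact form is not given here.


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def guided_loss(stego_feat, secret_feat, cls_loss, weight=0.5):
    """Classification loss plus a term pulling stego features toward secret features.

    stego_feat:  features extracted from the stego video (available at test time)
    secret_feat: features extracted from the secret video (training only)
    cls_loss:    action classification loss already computed on stego_feat
    weight:      strength of the guidance term (an assumed hyperparameter)
    """
    return cls_loss + weight * mse(stego_feat, secret_feat)


# Made-up feature vectors and classification loss.
loss = guided_loss([1.0, 2.0], [1.0, 4.0], cls_loss=0.3, weight=0.5)
```

Under this sketch the secret video is needed only while training; at inference the recognizer sees the stego video alone, which matches the abstract's statement that guidance happens during training.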
