SPDMark: Selective Parameter Displacement for Robust Video Watermarking

📅 2025-12-12
🤖 AI Summary
To address the challenge of simultaneously achieving imperceptibility, robustness, and efficiency in generative video provenance, this paper proposes embedding robust watermarks directly into the generation process of video diffusion models. The method encodes the watermark as a key-indexed composition of layer-wise basis shifts, implemented with LoRA adapters; derives frame-specific messages from the base key with a cryptographic hash so that temporal edits can be localized; and jointly optimizes message recovery, perceptual similarity (LPIPS), and temporal consistency losses. At extraction time, the correct frame order can be recovered even after frame-order tampering. Evaluated on both text-to-video and image-to-video diffusion models, the approach reportedly achieves watermark recovery accuracy exceeding 98% and LPIPS < 0.03, while remaining robust against common attacks including compression, cropping, and frame insertion/deletion.

📝 Abstract
The advent of high-quality video generation models has amplified the need for robust watermarking schemes that can be used to reliably detect and track the provenance of generated videos. Existing video watermarking methods based on both post-hoc and in-generation approaches fail to simultaneously achieve imperceptibility, robustness, and computational efficiency. This work introduces a novel framework for in-generation video watermarking called SPDMark (pronounced "SpeedMark") based on selective parameter displacement of a video diffusion model. Watermarks are embedded into the generated videos by modifying a subset of parameters in the generative model. To make the problem tractable, the displacement is modeled as an additive composition of layer-wise basis shifts, where the final composition is indexed by the watermarking key. For parameter efficiency, this work specifically leverages low-rank adaptation (LoRA) to implement the basis shifts. During the training phase, the basis shifts and the watermark extractor are jointly learned by minimizing a combination of message recovery, perceptual similarity, and temporal consistency losses. To detect and localize temporal modifications in the watermarked videos, we use a cryptographic hashing function to derive frame-specific watermark messages from the given base watermarking key. During watermark extraction, maximum bipartite matching is applied to recover the correct frame order, even from temporally tampered videos. Evaluations on both text-to-video and image-to-video generation models demonstrate the ability of SPDMark to generate imperceptible watermarks that can be recovered with high accuracy and also establish its robustness against a variety of common video modifications.
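
As a rough illustration of the "additive composition of layer-wise basis shifts" described in the abstract, the sketch below adds low-rank (LoRA-style) updates to one layer's weight matrix, with the key bits choosing which basis shifts are active. The abstract does not specify how the key indexes the composition, so the bit-mask selection, the helper names (`make_basis`, `displaced_weight`), and the tiny dimensions are all assumptions made for illustration, not the paper's implementation.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def make_basis(d_out, d_in, rank, rng):
    """Hypothetical basis shift stored as a low-rank pair (B, A), so the shift is B @ A."""
    B = [[rng.gauss(0, 0.01) for _ in range(rank)] for _ in range(d_out)]
    A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    return B, A

def displaced_weight(W, bases, key_bits):
    """Return W plus the sum of the basis shifts whose key bit is 1 (assumed indexing)."""
    out = [row[:] for row in W]
    for (B, A), bit in zip(bases, key_bits):
        if bit:
            out = add(out, matmul(B, A))
    return out

rng = random.Random(0)
d_out, d_in, rank, n_bases = 4, 3, 1, 2
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
bases = [make_basis(d_out, d_in, rank, rng) for _ in range(n_bases)]

# Key bits (1, 0): only the first basis shift is applied to this layer.
W_key = displaced_weight(W, bases, [1, 0])
```

In this toy setup each key yields a different displaced weight matrix, while storage grows only with the rank of the bases rather than with the full size of `W`, which is the parameter-efficiency argument the abstract gives for LoRA.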
Problem

Research questions and friction points this paper is trying to address.

Robust watermarking for generated video provenance tracking
Simultaneously achieving imperceptibility, robustness, and computational efficiency
Detecting and localizing temporal modifications in watermarked videos
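
The abstract says frame-specific watermark messages are derived from the base key with a cryptographic hash, which is what makes temporal edits detectable. A minimal sketch of that idea follows; the choice of SHA-256, the 4-byte big-endian frame index, the message length, and the function name `frame_message` are assumptions, since the summary does not name the hash or the encoding.

```python
import hashlib

def frame_message(base_key, frame_index, n_bits=32):
    """Derive a per-frame bit string from a base key (bytes) and a frame index.

    Assumed construction: SHA-256 over key || index, truncated to n_bits.
    """
    if not 0 < n_bits <= 256:
        raise ValueError("n_bits must be between 1 and 256 for SHA-256")
    digest = hashlib.sha256(base_key + frame_index.to_bytes(4, "big")).digest()
    bits = []
    for byte in digest:
        for shift in range(7, -1, -1):  # most significant bit first
            bits.append((byte >> shift) & 1)
    return bits[:n_bits]

base_key = b"example-base-key"  # hypothetical key material
messages = [frame_message(base_key, i, n_bits=64) for i in range(4)]
```

Because each frame carries a message tied to its own index, a verifier holding the base key can recompute the expected sequence and see which positions no longer agree.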
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective parameter displacement in video diffusion models
Low-rank adaptation for efficient watermark embedding
Cryptographic hashing with bipartite matching for tamper detection
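
To make the bipartite-matching step concrete, the sketch below pairs each message extracted from an observed frame with at most one expected frame index. It is one possible formulation, not the paper's: an edge exists when the Hamming distance is within a tolerance (`max_errors`, an assumed parameter), and a simple augmenting-path search finds a maximum matching. Observed frames left unmatched are candidates for inserted content.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def recover_frame_order(extracted, expected, max_errors=1):
    """Return a list whose entry i is the original index of observed frame i, or None."""
    # Edges of the bipartite graph: observed frame i -> plausible original indices.
    candidates = [
        [j for j, exp in enumerate(expected) if hamming(obs, exp) <= max_errors]
        for obs in extracted
    ]
    match_of_expected = {}  # expected index -> observed index

    def try_assign(i, seen):
        """Augmenting-path step: find an original index for observed frame i."""
        for j in candidates[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in match_of_expected or try_assign(match_of_expected[j], seen):
                match_of_expected[j] = i
                return True
        return False

    for i in range(len(extracted)):
        try_assign(i, set())

    order = [None] * len(extracted)
    for j, i in match_of_expected.items():
        order[i] = j
    return order

# Toy 8-bit messages for four original frames (hand-picked, not hash-derived).
expected = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
# Shuffled video: frame 2 with one bit error, frame 0, an inserted frame, frame 3, frame 1.
observed = [
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
]
order = recover_frame_order(observed, expected)  # [2, 0, None, 3, 1]
```

Here the recovered order exposes both the shuffle and the inserted frame at position 2. A weighted assignment over bit-agreement scores would be a natural alternative when extraction noise is higher.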