SPDMark: Selective Parameter Displacement for Robust Video Watermarking

📅 2025-12-12

📈 Citations: 0

✨ Influential: 0

career value

225K/year

🤖 AI Summary

To address the challenge of simultaneously achieving imperceptibility, robustness, and efficiency in generative video provenance, this paper proposes a novel paradigm for embedding robust watermarks directly into the generation process of video diffusion models. Methodologically, we introduce an inter-layer basis vector displacement encoding scheme to enable indexable watermark embedding; integrate LoRA adapters with cryptographic hashing for frame-level precise localization; and impose temporal consistency constraints alongside joint optimization—balancing message recovery, perceptual similarity (LPIPS), and temporal coherence—to support recovery from frame-order tampering. Evaluated on both text-to-video and image-to-video diffusion models, our approach achieves watermark recovery accuracy exceeding 98% and LPIPS < 0.03, while demonstrating strong robustness against common attacks including compression, cropping, and frame insertion/deletion.

Technology Category

Application Category

📝 Abstract

The advent of high-quality video generation models has amplified the need for robust watermarking schemes that can be used to reliably detect and track the provenance of generated videos. Existing video watermarking methods based on both post-hoc and in-generation approaches fail to simultaneously achieve imperceptibility, robustness, and computational efficiency. This work introduces a novel framework for in-generation video watermarking called SPDMark (pronounced `SpeedMark') based on selective parameter displacement of a video diffusion model. Watermarks are embedded into the generated videos by modifying a subset of parameters in the generative model. To make the problem tractable, the displacement is modeled as an additive composition of layer-wise basis shifts, where the final composition is indexed by the watermarking key. For parameter efficiency, this work specifically leverages low-rank adaptation (LoRA) to implement the basis shifts. During the training phase, the basis shifts and the watermark extractor are jointly learned by minimizing a combination of message recovery, perceptual similarity, and temporal consistency losses. To detect and localize temporal modifications in the watermarked videos, we use a cryptographic hashing function to derive frame-specific watermark messages from the given base watermarking key. During watermark extraction, maximum bipartite matching is applied to recover the correct frame order, even from temporally tampered videos. Evaluations on both text-to-video and image-to-video generation models demonstrate the ability of SPDMark to generate imperceptible watermarks that can be recovered with high accuracy and also establish its robustness against a variety of common video modifications.

Problem

Research questions and friction points this paper is trying to address.

Robust watermarking for generated video provenance tracking

Simultaneously achieving imperceptibility, robustness, and computational efficiency

Detecting and localizing temporal modifications in watermarked videos

Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective parameter displacement in video diffusion models

Low-rank adaptation for efficient watermark embedding

Cryptographic hashing with bipartite matching for tamper detection

🔎 Similar Papers

No similar papers found.

TikTok

San Jose, California

Senior AI Engineer, World Foundation Models

Nvidia

The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.

US, CA, Remote / US, WA, Remote / US, OR, Remote

AI Research Scientist, Computer Vision - Facebook Video Intelligence