Exposing and Mitigating Temporal Attack in Deepfake Video Detection

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing spatiotemporal deepfake detectors are vulnerable to temporal spectral attacks because they rely on fragile temporal-spectrum cues rather than robust semantic causality. This work proposes SpInShield, a temporal spectral-invariant defense framework that explicitly decouples semantic motion from manipulable spectral artifacts. SpInShield pairs a learnable spectral adversary, which dynamically synthesizes severe spectral deformations to simulate extreme attacks, with a shortcut-suppression optimization strategy that compels the encoder to extract reliable forensic cues while purging unstable spectral statistics from the latent space. The method achieves competitive performance on widely used datasets and outperforms the strongest baseline by 21.30 percentage points in AUC under simulated amplitude spectral attacks.

📝 Abstract

While spatiotemporal deepfake detectors achieve high AUC, our experiments reveal their susceptibility to evasion attacks. These models tend to overfit to fragile temporal spectrum cues rather than learning robust semantic causality. To mitigate this vulnerability, we propose SpInShield, a temporal spectral-invariant defense framework explicitly designed to decouple semantic motion from manipulable spectral artifacts. SpInShield introduces a learnable spectral adversary that dynamically synthesizes severe spectral deformations, simulating extreme attack scenarios. By employing a shortcut suppression optimization strategy, SpInShield compels the encoder to extract reliable forensic cues while purging unstable spectral statistics from the latent space. Experiments show that SpInShield obtains competitive performance on widely used datasets and outperforms the strongest baseline by 21.30 percentage points in AUC under simulated amplitude spectral attacks.
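To make the idea of an amplitude spectral attack along the time axis more concrete, the following is a minimal NumPy sketch. It is not the paper's attack or SpInShield's learnable adversary: the function name `perturb_temporal_amplitude`, the `strength` and `seed` parameters, and the choice of random per-frequency rescaling are all assumptions made for illustration. The sketch takes the FFT of a clip along the frame axis, rescales the amplitude of each temporal frequency bin, keeps the phase untouched, and transforms back.

```python
import numpy as np


def perturb_temporal_amplitude(video, strength=0.5, seed=0):
    """Randomly rescale the temporal amplitude spectrum of a clip.

    Hypothetical illustration only (not the paper's method).
    video: float array of shape (T, H, W, C), frames along axis 0.
    strength: maximum relative change per frequency bin, assumed in [0, 1).
    """
    rng = np.random.default_rng(seed)

    # Real FFT along the time axis: shape (T // 2 + 1, H, W, C).
    spectrum = np.fft.rfft(video, axis=0)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # One random gain per temporal frequency bin, in [1 - strength, 1 + strength].
    n_bins = spectrum.shape[0]
    scale = 1.0 + strength * rng.uniform(-1.0, 1.0, size=n_bins)
    scale[0] = 1.0  # leave the DC bin alone so the time-averaged frame is unchanged
    scale = scale.reshape(n_bins, 1, 1, 1)

    # Recombine the modified amplitude with the original phase.
    perturbed = (amplitude * scale) * np.exp(1j * phase)
    return np.fft.irfft(perturbed, n=video.shape[0], axis=0)


# Example: a random 16-frame clip standing in for real video data.
clip = np.random.default_rng(1).random((16, 8, 8, 3))
attacked = perturb_temporal_amplitude(clip, strength=0.5)
```

Because only the amplitude is changed and the DC bin is preserved, the time-averaged appearance of the clip stays the same while its temporal frequency statistics shift. A detector that leans on those statistics would presumably be sensitive to this kind of change, which is the vulnerability the abstract describes.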

Problem

Research questions and friction points this paper is trying to address.

deepfake video detection

temporal attack

adversarial evasion

spectral artifacts

model robustness

Innovation

Methods, ideas, or system contributions that make the work stand out.

temporal attack

spectral invariance

deepfake detection