Counteracting temporal attacks in Video Copy Detection

📅 2025-01-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address weak robustness to temporal attacks and high computational and storage overhead in video copy detection, this paper proposes an adaptive frame selection method based on local maxima of inter-frame differences. This lightweight sampling mechanism retains only motion-salient frames, which yields a compact feature representation. Compared to the state-of-the-art Dual-level detection method, the approach achieves comparable micro-average precision (μAP) while reducing representation size by 56% and running inference more than 2× faster. Efficiency improves by 1.4–5.8× relative to a 1 FPS baseline. By balancing accuracy, robustness against temporal attacks, and computational cost, the method is better suited to deployment in resource-constrained environments.
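The frame selection idea can be illustrated with a minimal sketch. It assumes the inter-frame difference is the mean absolute pixel difference between consecutive frames and that the frame following each peak is the one kept; the paper's actual difference measure, peak criterion, and any smoothing or thresholding are not given here, and the function names are hypothetical.

```python
import numpy as np


def interframe_differences(frames: np.ndarray) -> np.ndarray:
    """Mean absolute pixel difference between consecutive frames.

    frames has shape (T, H, W) or (T, H, W, C); the result has length T - 1,
    where entry i compares frame i with frame i + 1.
    ASSUMPTION: mean absolute difference is an illustrative choice of measure.
    """
    f = frames.astype(np.float32)  # avoid uint8 wrap-around on subtraction
    diffs = np.abs(f[1:] - f[:-1])
    return diffs.reshape(diffs.shape[0], -1).mean(axis=1)


def select_frames_by_local_maxima(frames: np.ndarray) -> np.ndarray:
    """Return indices of frames located at local maxima of the difference signal."""
    d = interframe_differences(frames)
    if d.size < 3:
        return np.array([], dtype=int)
    interior = d[1:-1]
    # A peak is strictly larger than its left neighbor and not smaller than its right one.
    is_peak = (interior > d[:-2]) & (interior >= d[2:])
    peak_idx = np.flatnonzero(is_peak) + 1  # position in the difference signal
    # ASSUMPTION: keep the later frame of each peak pair, i.e. frame i + 1 for d[i].
    return peak_idx + 1


# Synthetic clip: ten flat 4x4 frames with two abrupt brightness jumps.
values = [0, 0, 1, 10, 11, 11, 11, 12, 30, 30]
clip = np.stack([np.full((4, 4), v, dtype=np.uint8) for v in values])
print(select_frames_by_local_maxima(clip))  # [3 8]
```

Because the selected frames follow content changes rather than a fixed clock, the number of frames kept depends on how much motion the video contains, not on its duration or playback speed.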

📝 Abstract

Video Copy Detection (VCD) plays a crucial role in copyright protection and content verification by identifying duplicates and near-duplicates in large-scale video databases. The META AI Challenge on video copy detection provided a benchmark for evaluating state-of-the-art methods, with the Dual-level detection approach emerging as a winning solution. This method integrates Video Editing Detection (VED) and Frame Scene Detection to handle adversarial transformations and large datasets efficiently. However, our analysis reveals significant limitations in the VED component, particularly in its ability to handle exact copies. Moreover, Dual-level detection is vulnerable to temporal attacks. To address these issues, we propose an improved frame selection strategy based on local maxima of interframe differences, which enhances robustness against adversarial temporal modifications while significantly reducing computational overhead. Our method is 1.4 to 5.8 times more efficient than the standard 1 FPS approach. Compared to the Dual-level detection method, our approach maintains comparable micro-average precision ($\mu$AP) while also demonstrating improved robustness against temporal attacks. With a 56% smaller representation and inference more than 2 times faster, our approach is better suited to real-world resource constraints.
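For readers unfamiliar with the metric, micro-average precision is commonly computed by pooling the predicted query-reference matches from all queries into one ranked list and taking the average precision of that list, instead of averaging per-query scores. The sketch below follows that common definition; it is not taken from the paper's evaluation code, it ignores tied scores, and the function and argument names are hypothetical.

```python
import numpy as np


def micro_average_precision(scores, is_correct, num_positives: int) -> float:
    """Average precision over predictions pooled across all queries.

    scores:        confidence of each predicted (query, reference) match
    is_correct:    1 if the predicted match is a true copy pair, else 0
    num_positives: total number of ground-truth copy pairs, including
                   pairs that were never predicted
    """
    scores = np.asarray(scores, dtype=float)
    hits = np.asarray(is_correct, dtype=float)
    order = np.argsort(-scores, kind="stable")  # rank by descending confidence
    hits = hits[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    # Sum the precision at each correct prediction, normalized by all positives.
    return float((precision_at_k * hits).sum() / num_positives)


# Four pooled predictions, two of them correct, two ground-truth pairs in total.
print(micro_average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0], 2))
```

Since all predictions share one ranking, the confidence scores must be calibrated across queries, which is one reason this metric is stricter than a per-query average.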

Problem

Research questions and friction points this paper is trying to address.

Video Duplication Detection

Computational Efficiency

Storage Optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Video Duplication Detection

Temporal Robustness

Resource Efficiency

🔎 Similar Papers

Detecting AI-Generated Video via Frame Consistency

2024-02-03 · Citations: 1