Extending Information Bottleneck Attribution to Video Sequences

📅 2025-01-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the lack of interpretability in deepfake video detection models, this paper proposes Video Information Bottleneck Attribution (VIBA), the first extension of Information Bottleneck Attribution (IBA) to the spatiotemporal domain. VIBA introduces a dual-path attribution framework: it jointly leverages an Xception model for spatial feature extraction and a VGG11-based model augmented with optical flow to capture temporal dynamics, generating spatiotemporal saliency maps and optical flow attribution maps in a temporally consistent manner. Evaluated on a custom deepfake video dataset, VIBA achieves a 23.7% improvement in Intersection-over-Union (IoU) for localizing forged regions compared to image-level attribution methods and shows strong agreement with human annotations. This work establishes the first IBA-based paradigm for temporal modeling in video-level explainable AI, enhancing the trustworthiness and debuggability of dynamic forgery detection.
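For context, image-level IBA (the method VIBA extends) restricts information flow at an intermediate layer by blending each feature with Gaussian noise, Z = λR + (1 − λ)ε with ε ~ N(μ_R, σ_R²); the amount of information each feature transmits then has a closed-form KL expression. A minimal NumPy sketch of that per-feature term, under the standard Gaussian-bottleneck assumption (names are illustrative, not the paper's code):

```python
import numpy as np

def iba_information(lam, r, mu, sigma):
    """Closed-form KL(N(m1, s1^2) || N(mu, sigma^2)) for the IBA bottleneck
    Z = lam * r + (1 - lam) * eps, with eps ~ N(mu, sigma^2).

    lam   -- mask value(s) in [0, 1); 0 passes no information
    r     -- observed feature value(s)
    mu    -- mean of the feature over a reference dataset
    sigma -- std of the feature over a reference dataset
    """
    lam = np.asarray(lam, dtype=float)
    m1 = lam * r + (1.0 - lam) * mu   # mean of Z given r
    s1 = (1.0 - lam) * sigma          # std of Z given r
    # KL divergence between two univariate Gaussians, in nats
    return np.log(sigma / s1) + (s1**2 + (m1 - mu)**2) / (2.0 * sigma**2) - 0.5

# lam = 0 injects pure noise, so the feature transmits zero information
print(iba_information(0.0, r=2.0, mu=0.0, sigma=1.0))  # → 0.0
# information grows monotonically as lam approaches 1
print(iba_information(0.9, r=2.0, mu=0.0, sigma=1.0))
```

In IBA, relevance maps come from optimizing λ per location to maximize the model's prediction while penalizing this information term; VIBA's contribution is doing so consistently across video frames.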

📝 Abstract
We introduce VIBA, a novel approach for explainable video classification by adapting Information Bottlenecks for Attribution (IBA) to video sequences. While most traditional explainability methods are designed for image models, our IBA framework addresses the need for explainability in temporal models used for video analysis. To demonstrate its effectiveness, we apply VIBA to video deepfake detection, testing it on two architectures: the Xception model for spatial features and a VGG11-based model for capturing motion dynamics through optical flow. Using a custom dataset that reflects recent deepfake generation techniques, we adapt IBA to create relevance and optical flow maps, visually highlighting manipulated regions and motion inconsistencies. Our results show that VIBA generates temporally and spatially consistent explanations, which align closely with human annotations, thus providing interpretability for video classification and particularly for deepfake detection.
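The abstract emphasizes temporally consistent explanations. One generic way to encourage this (a simple illustration, not the paper's procedure) is to smooth per-frame relevance maps with a centered moving average over time, suppressing frame-to-frame flicker:

```python
import numpy as np

def smooth_relevance(maps, window=3):
    """Temporally smooth per-frame relevance maps with a centered moving
    average over the time axis.

    maps   -- array of shape (T, H, W): one relevance map per video frame
    window -- odd temporal window size
    """
    maps = np.asarray(maps, dtype=float)
    half = window // 2
    # pad by repeating the first/last frame so output keeps shape (T, H, W)
    padded = np.pad(maps, ((half, half), (0, 0), (0, 0)), mode="edge")
    return np.stack([padded[t:t + window].mean(axis=0)
                     for t in range(maps.shape[0])])

# a temporally constant relevance sequence is unchanged by smoothing
frames = np.ones((5, 4, 4))
assert np.allclose(smooth_relevance(frames), frames)
```

A single-frame spike in relevance is spread over its temporal neighbours, which is the flicker-reduction effect such consistency constraints aim for.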
Problem

Research questions and friction points this paper is trying to address.

Video Classification
Deep Fakes
Decision Interpretation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Information Bottleneck
Video Understanding
Deepfake Detection