Scalable, Energy-Efficient Optical-Neural Architecture for Multiplexed Deepfake Video Detection

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the high computational cost and low energy efficiency of existing deepfake video detection methods, which hinder large-scale deployment. The authors propose a hybrid digital-optical architecture: a lightweight digital front-end extracts key features, and an optical back-end uses a programmable spatial light modulator for spatially multiplexed, parallel inference, so that a single optical propagation pass processes 15 or more video streams at once. Applied to deepfake detection, the system reaches an average accuracy of 97.79% on Celeb-DF (99.86% sensitivity, 95.72% specificity) with 15 videos tested in parallel per pass. It lowers computational cost relative to purely digital methods while remaining robust to video degradation, compression, noise, experimental misalignment, and black-box adversarial attacks. The authors argue that throughput, energy efficiency, and adversarial robustness are difficult to achieve together in purely digital systems.
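The idea of spatial multiplexing can be illustrated with a toy numerical sketch. It is not the authors' optical model: the 3 x 5 tile layout, the 4 x 4 patch size, the shared `decoder` mask, and the helper names `tile_streams` and `read_out` are all assumptions made for illustration. The sketch tiles 15 per-video feature patches onto one plane, applies one elementwise modulation to the whole plane (a stand-in for a single propagation pass), and sums the intensity in each tile to obtain one score per video.

```python
import random

def tile_streams(patches, rows, cols):
    """Lay rows*cols equally sized 2D patches side by side on one plane."""
    h = len(patches[0])
    plane = []
    for r in range(rows):
        for i in range(h):
            line = []
            for c in range(cols):
                line.extend(patches[r * cols + c][i])
            plane.append(line)
    return plane

def read_out(plane, rows, cols, h, w):
    """Sum the values inside each tile, giving one score per stream."""
    return [
        sum(plane[r * h + i][c * w + j] for i in range(h) for j in range(w))
        for r in range(rows)
        for c in range(cols)
    ]

random.seed(0)
ROWS, COLS, H, W = 3, 5, 4, 4  # hypothetical layout: 15 tiles of 4 x 4

# Hypothetical feature patches, one per video stream.
features = [[[random.random() for _ in range(W)] for _ in range(H)]
            for _ in range(ROWS * COLS)]
# Hypothetical decoder mask, shared by every tile.
decoder = [[random.random() for _ in range(W)] for _ in range(H)]

plane = tile_streams(features, ROWS, COLS)               # multiplexed input
mask = tile_streams([decoder] * (ROWS * COLS), ROWS, COLS)

# One elementwise pass over the whole plane handles all 15 streams together.
modulated = [[p * m for p, m in zip(p_row, m_row)]
             for p_row, m_row in zip(plane, mask)]
scores = read_out(modulated, ROWS, COLS, H, W)
print(len(scores))  # 15
```

Each entry of `scores` matches what processing that patch on its own would give, which is the point of multiplexing: the per-stream results are unchanged, but they come out of one shared pass.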

📝 Abstract

The rapid proliferation of AI-generated visual media has created an urgent need for efficient, trustworthy deepfake detection systems. However, existing deep learning-based detection methods rely on computationally intensive and energy-demanding inference algorithms, limiting their scalability. Here, we present a hybrid digital-analog deepfake video detection framework that combines a lightweight digital front-end with a spatially multiplexed optical decoding back-end for massively parallel analog inference through a programmable spatial light modulator. By simultaneously processing 15 or more video streams within a single optical propagation pass, the system enables high-throughput and accurate video-level authenticity prediction at reduced computational cost compared with purely digital methods. We validated this hybrid deepfake video processor using different datasets spanning classical face-swapping, real-world deepfake recordings, and fully AI-generated videos. Using a spatially multiplexed experimental set-up operating in the visible spectrum, we achieved average deepfake detection accuracy, sensitivity and specificity of 97.79%, 99.86% and 95.72%, respectively, on the Celeb-DF video dataset with 15 videos tested in parallel in a single optical pass per inference. The multiplexed optical decoder also demonstrates resilience against various types of video degradation, noise, compression, experimental misalignments and black-box adversarial attacks. Our results show that integrating optical computation into AI inference enables simultaneous gains in throughput, energy efficiency, and adversarial robustness - three properties that are difficult to achieve together in purely digital systems.
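The three reported figures can be related with a short calculation. The sketch below assumes that deepfakes are the positive class, so sensitivity is the fraction of fake videos flagged and specificity is the fraction of real videos passed; the abstract does not state this convention. Under that assumption, the mean of the reported sensitivity and specificity equals the reported accuracy, which is what one would see if real and fake videos were equally represented in the test set or if a class-balanced accuracy were reported. The source does not say which applies. The function names and the example class counts are hypothetical.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of the per-class recall values (both in percent)."""
    return (sensitivity + specificity) / 2

def accuracy(sensitivity, specificity, n_fake, n_real):
    """Overall accuracy in percent for a given class mix."""
    correct = sensitivity * n_fake + specificity * n_real
    return correct / (n_fake + n_real)

# Figures reported in the abstract for Celeb-DF.
sens, spec = 99.86, 95.72

print(round(balanced_accuracy(sens, spec), 2))   # 97.79
print(round(accuracy(sens, spec, 100, 100), 2))  # 97.79 with equal class sizes
print(round(accuracy(sens, spec, 300, 100), 2))  # 98.83 with a hypothetical 3:1 fake:real mix
```

With an unequal class mix the overall accuracy moves toward the metric of the larger class, so the headline number should be read together with sensitivity and specificity.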

Problem

Research questions and friction points this paper is trying to address.

deepfake detection

energy efficiency

scalability

video authentication

computational cost

Innovation

Methods, ideas, or system contributions that make the work stand out.

optical neural architecture

spatial multiplexing

deepfake detection