AI Summary
This work addresses the challenge of deploying deepfake detection models on edge devices, where existing methods suffer from high computational cost and excessive parameter counts that hinder real-time inference. Moreover, conventional quantization approaches often degrade performance by inadvertently discarding critical forgery cues. To overcome these limitations, the study proposes the first quantization framework tailored for deepfake detection: an adaptive bidirectional compression scheme that jointly exploits feature correlations and eliminates redundant components, preserving discriminative details while reducing model size. The approach is compatible with a range of mainstream backbone architectures and is designed for edge deployment. Extensive experiments across five benchmark datasets and eleven state-of-the-art detectors show that it consistently outperforms existing compression techniques, enabling real-time deepfake detection on mobile platforms.
Abstract
Deepfake detection has become a fundamental component of modern media forensics. Despite significant progress in detection accuracy, most existing methods remain computationally intensive and parameter-heavy, limiting their deployment on resource-constrained edge devices that require real-time, on-site inference. This limitation is particularly critical in an era where mobile devices are extensively used for media-centric applications, including online payments, virtual meetings, and social networking. Meanwhile, because deepfake detection depends on capturing extremely subtle forgery artifacts, state-of-the-art quantization techniques usually underperform on this task. These fine-grained cues are highly sensitive to model compression and can be easily degraded during quantization, leading to noticeable performance drops. This challenge highlights the need for quantization strategies specifically designed to preserve the discriminative features essential for reliable deepfake detection. To address this gap, we propose DefakeQ, the first quantization framework tailored for deepfake detectors, enabling real-time deployment on edge devices. Our approach introduces a novel adaptive bidirectional compression strategy that simultaneously leverages feature correlations and eliminates redundancy, achieving an effective balance between model compactness and detection performance. Extensive experiments across five benchmark datasets and eleven state-of-the-art backbone detectors demonstrate that DefakeQ consistently surpasses existing quantization and model compression baselines. Furthermore, we deploy DefakeQ on mobile devices in real-world scenarios, demonstrating its capability for real-time deepfake detection and its practical applicability in edge environments.