🤖 AI Summary
This study addresses the growing challenge of detecting increasingly realistic deepfake videos, a setting where human perceptual capabilities are limited and media literacy interventions are urgently needed. In an experiment with 195 participants who judged real versus fake videos, rated their confidence, and reported their decision cues, we use behavioral analysis, confidence calibration, and association rule mining to systematically uncover the multimodal interplay of visual, auditory, and intuitive signals in detection strategies. Our findings reveal that successful identification often hinges on combinations of appearance, voice, and gut feeling, and that participants were more accurate and showed better-calibrated confidence on real videos. The study identifies specific cues that facilitate or hinder detection, offering an empirical foundation for targeted media literacy training.
📝 Abstract
As deepfake videos become increasingly difficult for people to recognise, understanding the strategies humans use is key to designing effective media literacy interventions. We conducted a study with 195 participants between the ages of 21 and 40, who judged real and deepfake videos, rated their confidence, and reported the cues they relied on across visual, audio, and knowledge-based strategies. Participants were more accurate with real videos than with deepfakes and showed lower expected calibration error for real content. Through association rule mining, we identified cue combinations that shaped performance. Cues of visual appearance, voice, and intuition often co-occurred in successful identifications, highlighting the importance of multimodal approaches in human detection. Our findings show which cues help or hinder detection and suggest directions for designing media literacy tools that guide effective cue use. Building on these insights can help people improve their identification skills and become more resilient to deceptive digital media.
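The two analyses named above can be sketched concretely. The following is a minimal, self-contained illustration, not the authors' actual pipeline: expected calibration error (ECE) bins participants' confidence ratings and measures the gap between average confidence and accuracy per bin, and a brute-force rule miner finds reported cue combinations (e.g. appearance, voice, gut feeling) whose presence is associated with correct judgments. All data, thresholds, and cue names here are hypothetical.

```python
from itertools import combinations

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average gap between mean confidence and accuracy,
    computed over equal-width confidence bins (0, 1]."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == 0.0)]
        if not idx:
            continue
        acc = sum(correct[i] for i in idx) / len(idx)
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(acc - avg_conf)
    return ece

def mine_cue_rules(records, min_support=0.2, min_confidence=0.6):
    """Brute-force association rules of the form {cues} -> correct.
    records: list of (set_of_reported_cues, was_correct) pairs.
    Returns (cue_tuple, support, confidence) triples."""
    n = len(records)
    all_cues = sorted({c for cues, _ in records for c in cues})
    rules = []
    for size in (1, 2, 3):                      # small antecedents only
        for combo in combinations(all_cues, size):
            combo = set(combo)
            hits = [ok for cues, ok in records if combo <= cues]
            if not hits or len(hits) / n < min_support:
                continue
            conf = sum(hits) / len(hits)        # P(correct | cues present)
            if conf >= min_confidence:
                rules.append((tuple(sorted(combo)), len(hits) / n, conf))
    return rules

if __name__ == "__main__":
    # Hypothetical trials: (reported cues, judged correctly?)
    trials = [
        ({"appearance", "voice", "gut"}, 1),
        ({"appearance", "voice"}, 1),
        ({"gut"}, 0),
        ({"voice"}, 1),
        ({"appearance"}, 0),
    ]
    confs = [0.9, 0.8, 0.7, 0.6, 0.9]
    correct = [t[1] for t in trials]
    print("ECE:", expected_calibration_error(confs, correct))
    for cues, sup, conf in mine_cue_rules(trials):
        print(f"{cues} -> correct  support={sup:.2f} confidence={conf:.2f}")
```

In real association rule mining one would use an Apriori-style pruning step rather than enumerating all combinations, but for a handful of cue categories the exhaustive version above is adequate.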