BaitRadar: A Multi-Model Clickbait Detection Algorithm Using Deep Learning

📅 2021-06-06
🏛️ IEEE International Conference on Acoustics, Speech, and Signal Processing
📈 Citations: 6
Influential: 0
🤖 AI Summary
To address clickbait on YouTube, where attractive titles and thumbnails misrepresent the actual video content, this paper proposes BaitRadar, a multi-model deep learning detection algorithm. It consults six inference models, each focused on a different attribute of the video: title, comments, thumbnail, tags, video statistics, and audio transcript. The final classification is the average of the model outputs, which keeps the decision robust when some of the data is missing. Tested on 1,400 YouTube videos, the method achieves an average test accuracy of 98% with an inference time of ≤ 2 s.

📝 Abstract
Following the rising popularity of YouTube, there is an emerging problem on this platform called clickbait, which provokes users to click on videos using attractive titles and thumbnails. As a result, users end up watching a video that does not have the content publicized in the title. This study addresses the issue by proposing an algorithm called BaitRadar, which uses a deep learning technique where six inference models are jointly consulted to make the final classification decision. These models focus on different attributes of the video: title, comments, thumbnail, tags, video statistics, and audio transcript. The final classification is obtained by averaging the outputs of the models, which provides a robust and accurate result even in situations where there is missing data. The proposed method is tested on 1,400 YouTube videos. On average, a test accuracy of 98% is achieved with an inference time of ≤ 2 s.
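
The averaging step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how each model reports its output or how missing attributes are handled, so the details here are assumptions for illustration only: each model is taken to return a clickbait probability in [0, 1], a missing attribute is represented as None and skipped, and the 0.5 decision threshold, the model names, and the function name `average_available` are all hypothetical.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical labels for the six attribute models listed in the abstract.
MODEL_NAMES = ("title", "comments", "thumbnail", "tags", "statistics", "transcript")


def average_available(
    scores: Mapping[str, Optional[float]], threshold: float = 0.5
) -> Tuple[float, bool]:
    """Average the clickbait probabilities of the models that produced a score.

    `scores` maps a model name to a probability in [0, 1], or None when that
    attribute is missing for the video. Skipping missing entries and the 0.5
    threshold are assumptions of this sketch, not details from the paper.
    """
    available = [scores[name] for name in MODEL_NAMES if scores.get(name) is not None]
    if not available:
        raise ValueError("no model produced a score for this video")
    mean_score = sum(available) / len(available)
    return mean_score, mean_score >= threshold


# Usage: comments are disabled and no transcript exists for this video,
# so only four of the six models contribute to the average.
example = {"title": 0.9, "comments": None, "thumbnail": 0.8, "tags": 0.7, "statistics": 0.6}
mean_score, is_clickbait = average_available(example)
print(f"mean score = {mean_score:.2f}, clickbait = {is_clickbait}")
```

Dividing by the number of available scores rather than by six keeps a missing attribute from pulling the average toward zero, which is one plausible reading of the robustness claim.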
Problem

Research questions and friction points this paper is trying to address.

Detects clickbait on YouTube using multi-model deep learning
Analyzes video titles, comments, thumbnails, tags, statistics, and audio
Achieves 98% accuracy with under 2s inference time
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-model deep learning for clickbait detection
Combines six models analyzing diverse video attributes
Averages the model outputs so the classification stays robust when some data is missing
Bhanuka Gamage
PhD Candidate, Monash University
HCI, Smart Glasses, Accessibility, Cerebral Visual Impairment, Human Centered AI
A. Labib
School of Information Technology, Monash University, 47500, Selangor, Malaysia
A. Joomun
School of Information Technology, Monash University, 47500, Selangor, Malaysia
C. H. Lim
School of Information Technology, Monash University, 47500, Selangor, Malaysia
Koksheik Wong
School of Information Technology, Monash University, 47500, Selangor, Malaysia