AI Summary
Automated video classification for MPAA age ratings (G/PG/PG-13/R) faces challenges including heavy reliance on labeled data, difficulty distinguishing boundary classes (e.g., PG-13 vs. R), and poor generalization. To address these, we propose a hybrid model integrating contextual contrastive learning with Bahdanau attention. Built upon the LRCN architecture, it jointly optimizes NT-Xent, NT-logistic, and margin triplet loss functions to enhance discriminative representation learning. Bahdanau attention dynamically weights key frames, improving fine-grained interpretability. Evaluated on a standard benchmark, our method achieves 88.0% accuracy and an F1 score of 0.8815, setting a new state of the art. The model has been deployed as a real-time web service for content compliance review on streaming platforms.
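To make the frame-weighting idea concrete, the following is a minimal sketch of additive (Bahdanau-style) attention used to pool per-frame features into a single clip vector. It is an illustration under stated assumptions, not the paper's implementation: the function name `additive_attention_pool`, the parameter shapes, and the choice to drop the decoder-state query term from the original Bahdanau formulation are all hypothetical, and the per-frame vectors stand in for the LSTM outputs of an LRCN backbone.

```python
import math


def additive_attention_pool(frames, W, b, v):
    """Pool T per-frame vectors into one context vector (query-free additive attention).

    Assumed shapes (illustrative, not taken from the paper):
      frames: T vectors of length d, e.g. per-frame LSTM outputs
      W:      a rows of length d (attention projection)
      b:      bias of length a
      v:      scoring vector of length a
    Returns (context, weights), where weights sum to 1 over the T frames.
    """
    scores = []
    for h in frames:
        # score_t = v . tanh(W h_t + b)
        hidden = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        scores.append(sum(v_i * u_i for v_i, u_i in zip(v, hidden)))

    # Numerically stable softmax over frames.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the frame vectors.
    d = len(frames[0])
    context = [sum(w * h[j] for w, h in zip(weights, frames)) for j in range(d)]
    return context, weights


# Toy usage: three 2-dimensional "frame" vectors and a one-unit attention layer.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = additive_attention_pool(frames, W=[[2.0, 0.0]], b=[0.0], v=[3.0])
```

The returned `weights` are what would be inspected to see which frames drove a rating decision. In a trained model, `W`, `b`, and `v` would be learned jointly with the backbone.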
Abstract
The rapid growth of visual content consumption across platforms necessitates automated video classification for age-suitability standards like the MPAA rating system (G, PG, PG-13, R). Traditional methods struggle with large labeled data requirements, poor generalization, and inefficient feature learning. To address these challenges, we employ contrastive learning for improved discrimination and adaptability, exploring three frameworks: Instance Discrimination, Contextual Contrastive Learning, and Multi-View Contrastive Learning. Our hybrid architecture integrates an LRCN (CNN+LSTM) backbone with a Bahdanau attention mechanism, achieving state-of-the-art performance in the Contextual Contrastive Learning framework, with 88% accuracy and an F1 score of 0.8815. By combining CNNs for spatial features, LSTMs for temporal modeling, and attention mechanisms for dynamic frame prioritization, the model excels in fine-grained borderline distinctions, such as differentiating PG-13 and R-rated content. We evaluate the model's performance across various contrastive loss functions, including NT-Xent, NT-logistic, and Margin Triplet, demonstrating the robustness of our proposed architecture. To ensure practical application, the model is deployed as a web application for real-time MPAA rating classification, offering an efficient solution for automated content compliance across streaming platforms.
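Of the three contrastive objectives named above, NT-Xent is the most widely used. The sketch below shows its standard form (normalized temperature-scaled cross-entropy over cosine similarities). It makes several assumptions: embeddings arrive as two aligned lists of nonzero vectors, where `view_a[i]` and `view_b[i]` form a positive pair. How the paper constructs positive pairs in its contextual framework is not specified here, and the function names and the default temperature are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss over 2N embeddings.

    view_a[i] and view_b[i] are assumed to be embeddings of two views of the
    same clip (a positive pair); every other embedding in the batch acts as a
    negative for that anchor.
    """
    z = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the positive partner of anchor i
        # Temperature-scaled similarities to every embedding except the anchor.
        logits = {
            k: cosine(z[i], z[k]) / temperature
            for k in range(2 * n)
            if k != i
        }
        # log-sum-exp with max subtraction for numerical stability.
        top = max(logits.values())
        log_denominator = top + math.log(
            sum(math.exp(value - top) for value in logits.values())
        )
        # -log( exp(positive logit) / sum of exp(all logits) )
        total += log_denominator - logits[pos]
    return total / (2 * n)


# Toy usage: correctly paired views versus deliberately mismatched views.
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the aligned pairing yields a lower loss than the mismatched one, which is the behavior the objective is meant to reward: positives are pulled together and all other batch members are pushed apart. A lower `temperature` sharpens the penalty on hard negatives.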