🤖 AI Summary
This study examines how to efficiently and accurately identify front-runner politicians and count individuals in Instagram stories and posts from the 2021 German federal election campaign. It compares the multimodal large language model GPT-4o with established computer vision models (FaceNet512, RetinaFace, and Google Cloud Vision) for Visual Political Communication analysis. On Instagram stories, GPT-4o achieves a macro F1-score of 0.89 for face recognition and 0.86 for person counting, outperforming the other models evaluated. The findings indicate that multimodal large language models can help scale and refine the analysis of political imagery, while raising methodological considerations for future research.
📝 Abstract
This paper presents a computational case study that evaluates the capabilities of specialized machine learning models and emerging multimodal large language models for Visual Political Communication (VPC) analysis. Focusing on concentrated visibility in Instagram stories and posts during the 2021 German federal election campaign, we compare the performance of traditional computer vision models (FaceNet512, RetinaFace, Google Cloud Vision) with a multimodal large language model (GPT-4o) in identifying front-runner politicians and counting individuals in images. GPT-4o outperformed the other models, achieving a macro F1-score of 0.89 for face recognition and 0.86 for person counting in stories. These findings demonstrate the potential of advanced AI systems to scale and refine visual content analysis in political communication while highlighting methodological considerations for future research.
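For readers unfamiliar with the reported metric, the following is a minimal sketch of the standard macro F1 definition: the F1-score is computed separately for each class and then averaged without weighting, so rare classes count as much as frequent ones. The abstract does not state how the authors computed their scores; the function name `macro_f1` and the example labels are hypothetical and purely illustrative, not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted
        # or never present in the ground truth.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels: which politician (if any) appears in each image.
y_true = ["candidate_a", "candidate_a", "candidate_b", "none", "none", "none"]
y_pred = ["candidate_a", "candidate_b", "candidate_b", "none", "none", "candidate_a"]
print(round(macro_f1(y_true, y_pred), 3))
```

Because every class contributes equally, a model that performs well only on the most frequent label (for example, images showing no politician) cannot reach a high macro F1-score.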