🤖 AI Summary
This study addresses greenwashing in oil-and-gas advertising—i.e., the discrepancy between climate-friendly messaging and actual emission-reduction practices. To systematically analyze such multimodal disinformation, we introduce the first benchmark for vision-language models (VLMs) dedicated to environmental narrative analysis: it comprises 1,200+ ad videos from over 50 companies and advocacy groups across 20 countries, sourced from Facebook and YouTube, and annotated by domain experts for 13 narrative framing types. Unlike prior text-only approaches, our benchmark enables systematic analysis of implicit environmental narratives, cross-cultural context, and variable-length video content. Experiments show that GPT-4.1 achieves 79% F1 on detecting basic environmental messages, while the best model reaches only 46% F1 on higher-order frames (e.g., green innovation), exposing critical limitations in VLMs' deeper semantic reasoning. This work establishes a new data foundation and methodological paradigm for energy communication governance and trustworthy AI evaluation.
📝 Abstract
Companies spend large amounts of money on public relations campaigns to project a positive brand image. However, sometimes there is a mismatch between what they say and what they do. Oil-and-gas companies, for example, are accused of "greenwashing" with imagery of climate-friendly initiatives. Understanding the framing, and changes in framing, at scale can help better characterize the goals and nature of public relations campaigns. To address this, we introduce a benchmark dataset of expert-annotated video ads obtained from Facebook and YouTube. The dataset provides annotations for 13 framing types covering more than 50 companies and advocacy groups across 20 countries. Our dataset is specifically designed for the evaluation of vision-language models (VLMs), distinguishing it from past text-only framing datasets. Baseline experiments show promising results while leaving room for improvement: GPT-4.1 can detect environmental messages with 79% F1 score, whereas our best model achieves only 46% F1 score on identifying framing around green innovation. We also identify challenges that VLMs must address, such as implicit framing, handling videos of varying lengths, and implicit cultural context. Our dataset contributes to research on multimodal analysis of strategic communication in the energy sector.