AI Summary
Existing approaches rely predominantly on textual signals and struggle to capture either the multimodal character of pro-eating disorder content in short videos or its rapidly evolving cultural context. To address this gap, this work introduces the concept of "zeitgeist-aware" detection and presents ZAM, the first dynamically evolving, expert-annotated multimodal dataset for pro-eating disorder content. ZAM combines multimodal analysis, expert annotation, adaptive inclusion criteria, and continuous data collection to support real-time detection of, and research on, harmful short-form content. In doing so, it fills a critical void in dynamic multimodal benchmarking and provides a scalable foundation for interdisciplinary research and responsive content moderation systems.
Abstract
Objective: Reliable identification of pro-eating disorder (pro-ED) content online suffers from two pervasive problems: 1) existing methods predominantly rely on text-based signals, failing to capture the inherently multimodal nature of multimedia content; and 2) these methods struggle to keep pace with the rapid evolution of references, memes, terminology, and contextual cues that underlie this content. Together, these limitations point to a gap: the absence of an expert-annotated reference standard capable of supporting real-time research and the training of robust multimodal detection models for pro-ED content on short-form video platforms. Method: To address this, we propose "zeitgeist-aware" multimodal (ZAM) datasets: continuously curated collections of annotated multimodal pro-ED content whose inclusion criteria evolve alongside the memetic zeitgeist, that is, the shifting sense of what is considered pro-ED as new media and references enter the culture and are absorbed and interpreted in online spaces. Results: We present a rationale for such datasets, define their core characteristics, outline approaches for their curation, and describe our progress toward that end. Discussion: This dataset and pipeline architecture may benefit researchers across several fields who are interested in how pro-ED sentiment is encoded and transmitted through short-form video content over time, including in support of responsive moderation efforts.