🤖 AI Summary
This study investigates the implicit influence mechanisms of social short-video platforms (exemplified by TikTok) on value formation among children and adolescents. Grounded in Schwartz’s theory of basic human values, we propose a two-stage multimodal value extraction paradigm—transforming videos into semantic scripts and then into fine-grained value annotations—and construct TikTok-Value, the first manually annotated, adolescent-oriented TikTok value dataset. Experiments demonstrate that fine-tuned masked language models (MLMs) significantly outperform large language models (LLMs) under few-shot inference for value identification; our approach accurately distinguishes explicit value expressions from implicit value contradictions, achieving state-of-the-art performance. Key contributions include: (1) releasing the first open-source, human-annotated TikTok value dataset; (2) empirically validating the script-mediated paradigm for modeling implicit values; and (3) establishing a reproducible methodology for platform-level content value assessment.
📝 Abstract
Societal and personal values are transmitted to younger generations through interaction and exposure. Traditionally, children and adolescents learned values from parents, educators, or peers. Nowadays, social platforms serve as a significant channel through which youth (and adults) consume information, as the main medium of entertainment, and possibly as a medium through which they learn different values. In this paper, we extract implicit values from TikTok videos uploaded by online influencers targeting children and adolescents. We curated a dataset of hundreds of TikTok videos and annotated them according to the Schwartz Theory of Personal Values. We then experimented with an array of Masked and Large Language Models, exploring how values can be detected. Specifically, we considered two pipelines -- direct extraction of values from video, and a 2-step approach in which videos are first converted to elaborated scripts and then values are extracted from the scripts. Achieving state-of-the-art results, we find that the 2-step approach performs significantly better than the direct approach, and that a fine-tuned Masked Language Model as the second step significantly outperforms few-shot application of a number of Large Language Models. We further discuss the impact of fine-tuning and compare the performance of the different models on the identification of values that are present or contradicted in the TikTok videos. Finally, we share the first values-annotated dataset of TikTok videos. Our results pave the way for further research on influence and value transmission in video-based social platforms.