🤖 AI Summary
Problem: Detecting coordinated multi-modal (video/audio/text) behavior on short-video platforms (e.g., TikTok) remains challenging because conventional text-reuse paradigms fail there and because non-exact, semantically aligned coordination is difficult to identify.
Method: We propose the first multi-layer heterogeneous network modeling framework tailored for short-video coordination, integrating cross-modal embedding alignment with graph signal processing to jointly measure complex similarities—including audio-visual semantic proximity—while relaxing the link-dependency assumption.
Contribution/Results: Evaluated on real political video data from the TikTok Researcher API, our method successfully uncovers multiple covert coordination campaigns. It systematically reveals novel coordination tactics—such as audio remixing and template-based editing—and identifies critical blind spots in existing detection systems. The framework significantly improves both accuracy and interpretability of multi-modal coordinated behavior detection.
📝 Abstract
Research on online coordinated behaviour has predominantly focused on text-based social media platforms, where coordination manifests clearly through the frequent posting of identical hyperlinks or the frequent re-sharing of the same textual content by the same group of users. However, the rise of short-video platforms like TikTok introduces distinct challenges: posts integrate multiple modalities, and the similarity between posts takes complex forms. In this paper, we propose an approach to detecting coordination that addresses these characteristic challenges. Our methodology, based on multilayer network analysis, is tailored to capture coordination across multiple modalities, including video, audio, and text, and explicitly handles complex forms of similarity inherent in video and audio content. We test this approach on political videos posted on TikTok and extracted via the TikTok researcher API. This application demonstrates the capacity of the approach to identify coordination, while also critically highlighting potential pitfalls and limitations.
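To make the multilayer idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each post already has one feature vector per modality, and it links two users in a given layer whenever a pair of their posts exceeds a cosine-similarity threshold for that modality. The function names (`cosine`, `build_layers`), the post format, and the thresholds are all hypothetical; the abstract does not specify which similarity measures or cut-offs the paper uses.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_layers(posts, thresholds):
    """Build one user-user co-action layer per modality.

    posts: list of {"user": str, "features": {layer_name: vector}} (assumed format).
    thresholds: {layer_name: minimum cosine similarity} (assumed values).
    Returns {layer_name: {(user_a, user_b): count of similar post pairs}}.
    """
    layers = {layer: defaultdict(int) for layer in thresholds}
    for p, q in combinations(posts, 2):
        if p["user"] == q["user"]:
            continue  # coordination is between distinct accounts
        edge = tuple(sorted((p["user"], q["user"])))
        for layer, tau in thresholds.items():
            fp = p["features"].get(layer)
            fq = q["features"].get(layer)
            if fp is not None and fq is not None and cosine(fp, fq) >= tau:
                layers[layer][edge] += 1
    return {layer: dict(edges) for layer, edges in layers.items()}


# Toy data: three users, one post each, with made-up 2-d features per modality.
posts = [
    {"user": "a", "features": {"video": [1.0, 0.0], "audio": [0.9, 0.1], "text": [0.0, 1.0]}},
    {"user": "b", "features": {"video": [0.0, 1.0], "audio": [1.0, 0.0], "text": [1.0, 0.0]}},
    {"user": "c", "features": {"video": [1.0, 0.05], "audio": [0.0, 1.0], "text": [0.1, 1.0]}},
]
layers = build_layers(posts, {"video": 0.9, "audio": 0.9, "text": 0.9})
```

In this toy example, users `a` and `b` are linked only in the audio layer, while `a` and `c` are linked in the video and text layers. A text-only detector would miss the first pair entirely, which is why keeping the layers separate is useful. In practice, edges would also be filtered by how often two accounts co-act, since a single similar post pair is weak evidence of coordination.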