Detecting Coordination on Short-Video Platforms: The Challenge of Multimodality and Complex Similarity on TikTok

📅 2025-06-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Detecting coordinated multimodal (video, audio, text) behavior on short-video platforms such as TikTok remains challenging because conventional text-reuse paradigms fail and non-exact, semantically aligned coordination is difficult to identify. Method: We propose the first multilayer heterogeneous network modeling framework tailored for short-video coordination. It integrates cross-modal embedding alignment with graph signal processing to jointly measure complex similarities, including audio-visual semantic proximity, while relaxing the link-dependency assumption. Contribution/Results: Evaluated on real political video data from the TikTok Researcher API, the method uncovers multiple covert coordination campaigns. It systematically reveals novel coordination tactics, such as audio remixing and template-based editing, and identifies critical blind spots in existing detection systems. The framework significantly improves both the accuracy and the interpretability of multimodal coordinated behavior detection.

📝 Abstract

Research on online coordinated behaviour has predominantly focused on text-based social media platforms, where coordination manifests clearly through the frequent posting of identical hyperlinks or the frequent re-sharing of the same textual content by the same group of users. However, the rise of short-video platforms like TikTok introduces distinct challenges by supporting integrated multimodality within posts and complex similarity between them. In this paper, we propose an approach to detecting coordination that addresses these characteristic challenges. Our methodology, based on multilayer network analysis, is tailored to capture coordination across multiple modalities, including video, audio, and text, and explicitly handles complex forms of similarity inherent in video and audio content. We test this approach on political videos posted on TikTok and extracted via the TikTok researcher API. This application demonstrates the capacity of the approach to identify coordination, while also critically highlighting potential pitfalls and limitations.

Problem

Research questions and friction points this paper is trying to address.

Detecting coordinated behavior on multimodal short-video platforms

Addressing complex similarity in video and audio content

Identifying coordination across video, audio, and text modalities

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilayer network analysis for multimodal coordination

Handles complex similarity in video and audio

Tailored for TikTok's integrated multimodality challenges
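
As a rough illustration of the multilayer idea above, the sketch below builds one user-to-user layer per modality and links two users on a layer when they post similar content close together in time. It is not the authors' implementation: the cosine similarity on precomputed feature vectors, the per-layer thresholds, the time window, and the function names `build_multilayer_network` and `multilayer_pairs` are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_multilayer_network(posts, thresholds, window_seconds):
    """Return {layer: {(user_a, user_b): weight}}.

    Assumed input shape (hypothetical): each post is a dict with 'user',
    'time' (seconds), and 'features' mapping a layer name to a vector.
    An edge weight counts the post pairs by two different users that fall
    within the time window and meet that layer's similarity threshold.
    """
    layers = {layer: defaultdict(int) for layer in thresholds}
    for p, q in combinations(posts, 2):
        if p["user"] == q["user"]:
            continue  # coordination is between distinct accounts
        if abs(p["time"] - q["time"]) > window_seconds:
            continue  # too far apart in time to count as co-activity
        pair = tuple(sorted((p["user"], q["user"])))
        for layer, threshold in thresholds.items():
            if cosine(p["features"][layer], q["features"][layer]) >= threshold:
                layers[layer][pair] += 1
    return {layer: dict(edges) for layer, edges in layers.items()}


def multilayer_pairs(network, min_layers=2):
    """User pairs that are connected on at least `min_layers` layers."""
    counts = defaultdict(int)
    for edges in network.values():
        for pair in edges:
            counts[pair] += 1
    return sorted(pair for pair, n in counts.items() if n >= min_layers)


# Toy data: users a and b post near-identical video and audio within
# 30 seconds, but with unrelated text; user c posts much later.
posts = [
    {"user": "a", "time": 0,
     "features": {"video": [1.0, 0.0], "audio": [0.0, 1.0], "text": [1.0, 1.0]}},
    {"user": "b", "time": 30,
     "features": {"video": [0.9, 0.1], "audio": [0.1, 0.9], "text": [1.0, -1.0]}},
    {"user": "c", "time": 5000,
     "features": {"video": [1.0, 0.0], "audio": [0.0, 1.0], "text": [1.0, 1.0]}},
]
thresholds = {"video": 0.9, "audio": 0.9, "text": 0.9}
network = build_multilayer_network(posts, thresholds, window_seconds=60)
print(network)
print(multilayer_pairs(network))
```

In this toy case a text-only detector would miss the pair (a, b), since their captions are unrelated, while the video and audio layers both link them. Using a similarity threshold rather than exact matching is what lets near-duplicates count, at the cost of having to choose the threshold per modality.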
