Live or Lie: Action-Aware Capsule Multiple Instance Learning for Risk Assessment in Live Streaming Platforms

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the detection of sparse, coordinated malicious activity on live-streaming platforms under weak supervision, where only room-level labels are available. The authors formulate risk assessment as a multiple instance learning (MIL) problem and propose the Action-aware Capsule MIL framework (AC-MIL), which uses structured user-timeslot capsules as instances to jointly model individual behaviors and group coordination patterns. By combining serial and parallel architectures, AC-MIL models semantics at multiple granularities while providing interpretability at the behavior-segment level. Evaluated on large-scale industrial datasets from Douyin, the proposed method significantly outperforms existing MIL and sequential baselines, achieving a new state of the art in room-level risk prediction and identifying risky behavior segments that can serve as actionable evidence for intervention.

📝 Abstract

Live streaming has become a cornerstone of today's internet, enabling massive real-time social interactions. However, it faces severe risks arising from sparse, coordinated malicious behaviors among multiple participants, which are often concealed within normal activities and difficult to detect promptly and accurately. In this work, we provide a pioneering study on risk assessment in live streaming rooms, characterized by weak supervision where only room-level labels are available. We formulate the task as a Multiple Instance Learning (MIL) problem, treating each room as a bag and defining structured user-timeslot capsules as instances. These capsules represent subsequences of user actions within specific time windows, encapsulating localized behavioral patterns. Based on this formulation, we propose AC-MIL, an Action-aware Capsule MIL framework that models both individual behaviors and group-level coordination patterns. AC-MIL captures multi-granular semantics and behavioral cues through a serial and parallel architecture that jointly encodes temporal dynamics and cross-user dependencies. These signals are integrated for robust room-level risk prediction, while also offering interpretable evidence at the behavior segment level. Extensive experiments on large-scale industrial datasets from Douyin demonstrate that AC-MIL significantly outperforms MIL and sequential baselines, establishing new state-of-the-art performance in room-level risk assessment for live streaming. Moreover, AC-MIL provides capsule-level interpretability, enabling identification of risky behavior segments as actionable evidence for intervention. The project page is available at: https://qiaoyran.github.io/AC-MIL/.
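
To make the bag/instance formulation concrete, here is a minimal Python sketch of how one room's action log might be grouped into user-timeslot capsules and then scored at room level. Everything in it is an assumption for illustration: the event tuple layout, the 60-second window, the `build_capsules` and `max_pool_bag_score` helpers, and the toy scorer are hypothetical and not taken from the paper. The aggregation shown is plain max-pooling, a standard MIL baseline, rather than AC-MIL's serial and parallel architecture.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical event record: (user_id, timestamp_seconds, action_name).
Event = Tuple[str, float, str]
# A capsule is identified by (user_id, timeslot_index).
CapsuleKey = Tuple[str, int]


def build_capsules(events: List[Event], window: float = 60.0) -> Dict[CapsuleKey, List[str]]:
    """Group one room's events (the bag) into user-timeslot capsules (the instances)."""
    capsules: Dict[CapsuleKey, List[str]] = defaultdict(list)
    # Sort by timestamp so each capsule keeps its actions in temporal order.
    for user, ts, action in sorted(events, key=lambda e: e[1]):
        slot = int(ts // window)
        capsules[(user, slot)].append(action)
    return dict(capsules)


def toy_capsule_score(actions: List[str]) -> float:
    """Placeholder instance scorer (assumption): fraction of 'gift' actions."""
    return actions.count("gift") / len(actions)


def max_pool_bag_score(scores: Dict[CapsuleKey, float]) -> Tuple[float, CapsuleKey]:
    """Max-pooling MIL baseline: the room is as risky as its riskiest capsule."""
    top_key = max(scores, key=lambda k: scores[k])
    return scores[top_key], top_key


room_events: List[Event] = [
    ("u1", 5.0, "enter"),
    ("u2", 12.0, "comment"),
    ("u1", 42.0, "gift"),
    ("u1", 75.0, "comment"),
    ("u2", 80.0, "gift"),
]

bag = build_capsules(room_events, window=60.0)
scores = {key: toy_capsule_score(actions) for key, actions in bag.items()}
room_score, top_capsule = max_pool_bag_score(scores)
```

Only the room-level score would be compared against a label during training, which is the weak-supervision setting the abstract describes. Returning the highest-scoring capsule alongside the score mirrors, in a very simplified way, the idea of pointing to a specific behavior segment as evidence.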

Problem

Research questions and friction points this paper is trying to address.

Live Streaming

Risk Assessment

Multiple Instance Learning

Malicious Behavior

Weak Supervision

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple Instance Learning

Action-aware Capsule

Live Streaming Risk Assessment