🤖 AI Summary
This study evaluates the causal effects of large-scale deplatforming interventions, exemplified by Reddit's "Great Ban." Leveraging longitudinal log data from nearly 34,000 users and 53 million comments, it applies a difference-in-differences (DID) framework, the first use of DID to quantify heterogeneous treatment effects of deplatforming. Results reveal: (1) 15.6% of banned users permanently disengage; (2) among retained users, aggregate comment toxicity declines by 4.1%, yet a distinct subgroup exhibits a 70% *increase* in toxicity; (3) this high-toxicity subgroup shows no significant rise in activity or engagement, challenging the common hypothesis that deplatforming intensifies extremist behavior. The paper's contributions are threefold: it pioneers DID-based estimation of differential deplatforming effects; it uncovers non-monotonic toxicity responses; and it demonstrates substantial individual-level heterogeneity, refuting uniform-effect assumptions. These findings provide granular, causal evidence to inform platform content governance policies.
📝 Abstract
In today's online environments, users encounter harm and abuse on a daily basis, making content moderation crucial to ensure their safety and well-being. However, the effectiveness of many moderation interventions remains uncertain. Here, we apply a causal inference approach to shed light on the effectiveness of The Great Ban, a massive social media deplatforming intervention. We analyze 53M comments shared by nearly 34K users, providing in-depth results on both the intended and unintended consequences of the ban. Our causal analyses reveal that 15.6% of the moderated users abandoned the platform, while the remaining users decreased their overall toxicity by 4.1%. Nonetheless, a subset of those users increased their toxicity by 70% after the intervention. These increases in toxicity, however, did not lead to marked increases in activity or engagement, meaning that the most toxic users had an overall limited impact. Our findings offer new insights into the effectiveness of deplatforming interventions and contribute to informing future content moderation strategies.
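The causal contrast at the heart of the study's difference-in-differences (DID) framework can be illustrated with a minimal sketch. All numbers below are hypothetical and not taken from the paper; they only show the two-group, two-period arithmetic behind a DID estimate of a ban's effect on toxicity.

```python
# Minimal two-period difference-in-differences (DID) sketch.
# Values are illustrative placeholders, NOT results from the paper.

# Mean toxicity scores before and after the intervention
treated_pre, treated_post = 0.32, 0.28   # users of the banned communities
control_pre, control_post = 0.30, 0.29   # comparable unaffected users

# DID: change among treated users minus change among controls,
# which nets out platform-wide trends shared by both groups.
did = (treated_post - treated_pre) - (control_post - control_pre)
print(f"DID estimate of the ban's effect on toxicity: {did:+.3f}")
```

Under the standard parallel-trends assumption (absent the ban, both groups' toxicity would have evolved in parallel), the negative estimate is read as the ban reducing toxicity among retained users; the paper's per-user application of this logic is what surfaces the heterogeneous responses, including the subgroup whose toxicity rose.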