🤖 AI Summary
This study evaluates the causal effects of large-scale deplatforming interventions, exemplified by Reddit's "Great Ban." Leveraging longitudinal log data from nearly 34,000 users and 53 million comments, it applies a difference-in-differences (DID) framework, the first use of DID to quantify heterogeneous treatment effects of deplatforming. Results reveal: (1) 15.6% of banned users permanently disengage; (2) among retained users, aggregate comment toxicity declines by 4.1%, yet a distinct subgroup exhibits a 70% *increase* in toxicity; (3) this high-toxicity subgroup shows no significant rise in activity or engagement, challenging the common hypothesis that deplatforming intensifies extremist behavior. The paper's contributions are threefold: it pioneers DID-based estimation of differential deplatforming effects; it uncovers non-monotonic toxicity responses; and it demonstrates substantial individual-level heterogeneity, refuting uniform-effect assumptions. These findings provide granular, causal evidence to inform platform content governance policies.
📝 Abstract
In today's online environments, users encounter harm and abuse on a daily basis, making content moderation crucial to ensure their safety and well-being. However, the effectiveness of many moderation interventions remains uncertain. Here, we apply a causal inference approach to shed light on the effectiveness of The Great Ban, a massive social media deplatforming intervention. We analyze 53M comments shared by nearly 34K users, providing in-depth results on both the intended and unintended consequences of the ban. Our causal analyses reveal that 15.6% of the moderated users abandoned the platform, while the remaining users decreased their overall toxicity by 4.1%. Nonetheless, a subset of those users increased their toxicity by 70% after the intervention. These increases in toxicity, however, did not lead to marked increases in activity or engagement, meaning that the most toxic users had an overall limited impact. Our findings offer new insights into the effectiveness of deplatforming interventions and contribute to informing future content moderation strategies.
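The causal contrast at the heart of the study's difference-in-differences (DID) framework can be illustrated with a minimal sketch. All numbers below are hypothetical and not taken from the paper; they only show the two-group, two-period arithmetic behind a DID estimate of a ban's effect on toxicity.

```python
# Minimal two-period difference-in-differences (DID) sketch.
# Values are illustrative placeholders, NOT results from the paper.

# Mean toxicity scores before and after the intervention
treated_pre, treated_post = 0.32, 0.28   # users of the banned communities
control_pre, control_post = 0.30, 0.29   # comparable unaffected users

# DID: change among treated users minus change among controls,
# which nets out platform-wide trends shared by both groups.
did = (treated_post - treated_pre) - (control_post - control_pre)
print(f"DID estimate of the ban's effect on toxicity: {did:+.3f}")
```

Under the standard parallel-trends assumption (absent the ban, both groups' toxicity would have evolved in parallel), the negative estimate is read as the ban reducing toxicity among retained users; the paper's per-user application of this logic is what surfaces the heterogeneous responses, including the subgroup whose toxicity rose.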