Beyond Trial-and-Error: Predicting User Abandonment After a Moderation Intervention

📅 2024-04-23
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses a limitation of conventional content moderation, namely that the effects of an intervention are only evaluated post hoc, by proposing a new task: predicting which users will abandon the platform after a moderation intervention, so that moderators can anticipate the impact before acting. The authors study the reactions of 16,540 Reddit users to a massive ban of communities and use 13.8 million posts to compute 142 features capturing activity, toxicity, relations, and writing style. They train a set of binary classifiers; the best-performing model achieves a micro F1-score of 0.914 and generalizes robustly to users from previously unseen communities. Activity features are the most informative predictors, followed by relational and toxicity features, while writing-style features are of limited use. The results support a shift from reactive to predictive moderation, giving platform administrators a tool to assess the likely impact of an intervention before applying it.

📝 Abstract
Current content moderation follows a reactive, trial-and-error approach, where interventions are applied and their effects are only measured post-hoc. In contrast, we introduce a proactive, predictive approach that enables moderators to anticipate the impact of their actions before implementation. We propose and tackle the new task of predicting user abandonment following a moderation intervention. We study the reactions of 16,540 users to a massive ban of online communities on Reddit, training a set of binary classifiers to identify those users who would abandon the platform after the intervention -- a problem of great practical relevance. We leverage a dataset of 13.8 million posts to compute a large and diverse set of 142 features, which convey information about the activity, toxicity, relations, and writing style of the users. We obtain promising results, with the best-performing model achieving micro F1-score = 0.914. Our model shows robust generalizability when applied to users from previously unseen communities. Furthermore, we identify activity features as the most informative predictors, followed by relational and toxicity features, while writing style features exhibit limited utility. Theoretically, our results demonstrate the feasibility of adopting a predictive machine learning approach to estimate the effects of moderation interventions. Practically, this work marks a fundamental shift from reactive to predictive moderation, equipping platform administrators with intelligent tools to strategically plan interventions, minimize unintended consequences, and optimize user engagement.
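
The evaluation setup described in the abstract (binary abandonment labels, a micro F1-score, and testing on users from unseen communities) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `community` and `abandoned` fields, and the toy records are all assumptions made for illustration.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def split_by_community(users, held_out):
    """Keep every user of a held-out community out of the training set."""
    train = [u for u in users if u["community"] not in held_out]
    test = [u for u in users if u["community"] in held_out]
    return train, test


# Toy records (hypothetical): 1 = user abandoned the platform, 0 = user stayed.
users = [
    {"id": "u1", "community": "a", "abandoned": 1},
    {"id": "u2", "community": "a", "abandoned": 0},
    {"id": "u3", "community": "b", "abandoned": 0},
    {"id": "u4", "community": "c", "abandoned": 1},
]
train_users, test_users = split_by_community(users, held_out={"c"})

# Toy predictions (hypothetical) scored against toy ground truth.
score = micro_f1([1, 0, 0, 1, 0], [1, 0, 1, 1, 0])
```

For single-label problems such as this one, the micro-averaged F1 coincides with accuracy, because every misclassified user counts once as a false positive for one class and once as a false negative for the other.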
Problem

Research questions and friction points this paper is trying to address.

Predict user abandonment after moderation interventions
Shift from reactive to predictive content moderation
Identify key features influencing user retention post-intervention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Predictive machine learning for moderation impact
Binary classifiers to forecast user abandonment
Activity, toxicity, relational features predict outcomes
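
One simple way to compare feature families like those named above is to sum per-feature importance scores within each family and rank the totals. The sketch below assumes such scores are already available from a trained model; the feature names, the family assignment, and the numbers are made-up placeholders, not values from the paper, and the paper may have ranked families differently.

```python
from collections import defaultdict


def rank_feature_groups(importances, feature_group):
    """Sum per-feature importance by family and sort families, highest first."""
    totals = defaultdict(float)
    for feature, score in importances.items():
        totals[feature_group[feature]] += score
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Placeholder importance scores (hypothetical, for illustration only).
importances = {
    "posts_per_day": 4.0,
    "active_days": 3.0,
    "num_reply_partners": 5.0,
    "mean_toxicity": 2.0,
    "avg_sentence_length": 1.0,
}
# Hypothetical mapping from each feature to its family.
feature_group = {
    "posts_per_day": "activity",
    "active_days": "activity",
    "num_reply_partners": "relations",
    "mean_toxicity": "toxicity",
    "avg_sentence_length": "writing_style",
}

ranking = rank_feature_groups(importances, feature_group)
for group, total in ranking:
    print(f"{group}: {total}")
```

Summing favors families with many features; averaging within each family instead is a reasonable alternative when family sizes differ widely.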
Benedetta Tessa
Institute of Informatics and Telematics (IIT) of the National Research Council (CNR), Pisa, Italy
Lorenzo Cima
Institute of Informatics and Telematics (IIT) of the National Research Council (CNR), Pisa, Italy
Amaury Trujillo
Researcher, IIT-CNR
Social Computing, Human-Computer Interaction, Web Technologies
M. Avvenuti
University of Pisa, Department of Information Engineering, Pisa, Italy
S. Cresci
Institute of Informatics and Telematics (IIT) of the National Research Council (CNR), Pisa, Italy