A Critical Reflection on the Use of Toxicity Detection Algorithms in Proactive Content Moderation Systems

📅 2024-01-19
🏛️ International Journal of Human-Computer Studies
📈 Citations: 2
Influential: 0
🤖 AI Summary
This paper critically examines the socio-technical risks arising from the shift of toxicity detection algorithms from passive content moderation to proactive interventions—such as real-time keyboard interception. Through participatory workshops with diverse stakeholders and context-sensitive modeling, augmented by socio-technical analysis, we identify three core risks: inequitable distribution of intervention benefits, malicious circumvention and gamification of hate speech, and adversarial contamination of models. We introduce, for the first time, the analytical frameworks of “contextual insensitivity” and “interventional justice” to expose how algorithmic interventions implicitly harm marginalized groups in complex sociolinguistic contexts. Our contribution comprises an actionable ethical review checklist and design constraint principles, grounded in empirical findings and normative reasoning. These resources provide both theoretical foundations and practical guidance for the responsible deployment of AI in proactive content governance.

📝 Abstract
Toxicity detection algorithms, originally designed with reactive content moderation in mind, are increasingly being deployed in proactive end-user interventions to moderate content. Through a socio-technical lens, and focusing on the contexts in which they are applied, we explore the use of these algorithms in proactive moderation systems. Placing a toxicity detection algorithm in an imagined virtual mobile keyboard, we critically explore how such algorithms could be used to proactively reduce the sending of toxic content. We present findings from design workshops conducted with four distinct stakeholder groups and find concerns around how contextual complexities may exacerbate inequalities in content moderation processes. Whilst only specific user groups are likely to directly benefit from these interventions, we highlight the potential for other groups to misuse them to circumvent detection, validate and gamify hate, and manipulate algorithmic models to exacerbate harm.
Problem

Research questions and friction points this paper is trying to address.

Explores toxicity algorithms in proactive content moderation.
Investigates contextual complexities in algorithm application.
Highlights risks of misuse in algorithmic moderation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proactive toxicity detection algorithms
Virtual mobile keyboard integration
Stakeholder design workshop insights
Mark Warner
Department of Computer Science, University College London, UK
Angelika Strohmayer
School of Design, Northumbria University, UK
Matthew Higgs
Independent Researcher, UK
Lynne Coventry
Division of Cybersecurity, Abertay University, Scotland, UK