Social bias is prevalent in user reports of hate and abuse online

📅 2025-10-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study identifies a systemic social bias in online hate-content reporting mechanisms: users are significantly more likely to report abusive content targeting their own social group (in-group) than identical content targeting out-groups, thereby undermining platform-level harm mitigation. Through five high-statistical-power, pre-registered online experiments spanning four sensitive domains—political orientation, vaccine attitudes, climate change beliefs, and abortion rights—we find that approximately 50% of abusive comments are reported, with a robust and statistically significant in-group reporting bias across all contexts (mean effect size *d* = 0.38). This work provides the first multi-domain, ecologically valid empirical demonstration of identity-dependent reporting behavior. It establishes critical behavioral evidence and theoretical grounding for redesigning fairer, more robust content moderation systems that account for sociopsychological reporting biases.
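The reported mean effect size (*d* = 0.38) is a standardized mean difference (Cohen's *d*), i.e. the gap between in-group and out-group flagging rates divided by their pooled standard deviation. A minimal sketch of the computation, using invented flagging proportions chosen only for illustration (not the paper's data):

```python
# Illustrative only: Cohen's d for an in-group vs. out-group flagging
# difference. All means, SDs, and sample sizes below are hypothetical.
import math

def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

# Hypothetical per-participant flagging proportions for in-group-directed
# (a) vs. out-group-directed (b) abuse
d = cohens_d(mean_a=0.56, mean_b=0.44, sd_a=0.31, sd_b=0.32, n_a=400, n_b=400)
print(round(d, 2))  # → 0.38
```

A *d* around 0.38 is conventionally read as a small-to-medium effect; here it means the bias is modest per comment but, applied across all user reports, systematically skews which abuse gets surfaced to moderators.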

📝 Abstract
The prevalence of online hate and abuse is a pressing global concern. While tackling such societal harms is a priority for research across the social sciences, it is a difficult task, in part because of the magnitude of the problem. User engagement with reporting mechanisms (flagging) online is an increasingly important part of monitoring and addressing harmful content at scale. However, users may not flag content routinely enough, and when they do engage, they may be biased by group identity and political beliefs. Across five well-powered and pre-registered online experiments, we examine the extent of social bias in the flagging of hate and abuse in four different intergroup contexts: political affiliation, vaccination opinions, beliefs about climate change, and stance on abortion rights. Overall, participants reported abuse reliably, with approximately half of the abusive comments in each study reported. However, a pervasive social bias was present whereby ingroup-directed abuse was consistently flagged to a greater extent than outgroup-directed abuse. Our findings offer new insights into the nature of user flagging online, an understanding of which is crucial for enhancing user intervention against online hate speech and thus ensuring a safer online environment.
Problem

Research questions and friction points this paper is trying to address.

Examining social bias in online hate reporting mechanisms
Investigating ingroup favoritism in flagging abusive content
Addressing biased user engagement with abuse reporting systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Examining social bias in user flagging across intergroup contexts
Finding ingroup abuse flagged more than outgroup abuse
Providing insights to enhance user intervention against hate speech
Florence E. Enock
Public Policy Programme, The Alan Turing Institute, The British Library, 96 Euston Road, London NW1 2DB
Helen Z. Margetts
Oxford Internet Institute, University of Oxford, Stephen A. Schwarzman Centre for the Humanities, Radcliffe Observatory Quarter, Oxford OX2 6GG; Data Science Institute, The London School of Economics and Political Science, Houghton Street, London WC2A 2AE
Jonathan Bright
CTO at pattrn.ai
AI · AI safety · Online safety · AI for government