🤖 AI Summary
This study addresses a critical gap in understanding how anti-hate speech chatbots influence bystander behavior in online communities. We proposed a framework of bystander-oriented counterspeech strategies, developed Civilbot, a natural language generation–based chatbot, and evaluated it in a mixed-methods within-subjects experiment. Our findings indicate that cognitive strategies paired with a positive tone are most effective, with intervention outcomes highly contingent on contextual appropriateness. When Civilbot performs well, it can either guide or substitute for human bystander intervention; however, poor performance may suppress bystanders' willingness to act. This work provides both theoretical insights and practical guidance for AI-driven community moderation and governance.
📝 Abstract
Counterspeech offers a non-repressive approach to moderating hate speech in online communities. Research has examined how counterspeech chatbots restrain hate speakers and support targets, but their impact on bystanders remains unclear. We therefore developed a counterspeech strategy framework and built Civilbot for a mixed-methods within-subjects study. Bystanders generally viewed Civilbot as credible and normative, though its shallow reasoning limited its persuasiveness. Its behavioral effects were subtle: when performing well, it could guide participation or act as a stand-in; when performing poorly, it could discourage bystanders or motivate them to step in themselves. Strategy proved critical: cognitive strategies that appeal to reason were relatively effective, especially when paired with a positive tone, while a mismatch between context and strategy could weaken impact. Based on these findings, we offer design insights for mobilizing bystanders and shaping online discourse, highlighting when to intervene and how to do so through reasoning-driven, context-aware strategies.