Unified Game Moderation: Soft-Prompting and LLM-Assisted Label Transfer for Resource-Efficient Toxicity Detection

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Toxicity detection in gaming communities faces three key challenges: poor cross-game generalization, scarce multilingual annotated data, and high computational overhead for real-time inference. This paper proposes a lightweight multi-game, multilingual toxicity detection framework. First, it introduces a game-context-aware learnable soft-prompt mechanism that lets a single model adapt across diverse gaming environments. Second, it designs an LLM-driven zero-shot label transfer framework that extends coverage to low-resource languages. Built on an optimized BERT architecture and trained with a macro-F1-oriented objective, the model supports seven additional languages. On French, German, Portuguese, and Russian test sets it attains macro-F1 scores of 32.96%–58.88%, with German surpassing the English benchmark of 45.39%. Deployed in production, it identifies roughly 50 sanctionable players per game per day while exhibiting low inference latency, minimal resource consumption, and significantly reduced maintenance costs.
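The reported scores use macro F1, which averages per-class F1 without weighting by class frequency, so rare toxic messages count as much as the dominant clean class. A minimal from-scratch sketch of the metric (toy labels, not the paper's data):

```python
def macro_f1(y_true, y_pred):
    """Macro F1: unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy example: imbalanced binary toxic/clean labels
y_true = ["toxic", "clean", "toxic", "clean", "clean"]
y_pred = ["toxic", "clean", "clean", "clean", "toxic"]
print(round(macro_f1(y_true, y_pred), 4))  # → 0.5833
```

Because macro F1 treats classes equally, it is a stricter target than accuracy for moderation data, where clean chat vastly outnumbers toxic chat.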

📝 Abstract
Toxicity detection in gaming communities faces significant scaling challenges when expanding across multiple games and languages, particularly in real-time environments where computational efficiency is crucial. We present two key findings to address these challenges while building upon our previous work on ToxBuster, a BERT-based real-time toxicity detection system. First, we introduce a soft-prompting approach that enables a single model to effectively handle multiple games by incorporating game-context tokens, matching the performance of more complex methods like curriculum learning while offering superior scalability. Second, we develop an LLM-assisted label transfer framework using GPT-4o-mini to extend support to seven additional languages. Evaluations on real game chat data across French, German, Portuguese, and Russian achieve macro F1-scores ranging from 32.96% to 58.88%, with particularly strong performance in German, surpassing the English benchmark of 45.39%. In production, this unified approach significantly reduces computational resources and maintenance overhead compared to maintaining separate models for each game and language combination. At Ubisoft, this model successfully identifies an average of 50 players per game per day engaging in sanctionable behavior.
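The abstract's soft-prompting idea can be sketched as follows: each game gets a small bank of learnable vectors that are prepended to the token embeddings, so one shared encoder conditions on which title a chat line came from. This is a toy illustration under stated assumptions; the dimensions, game names, and the stand-in embedding function are invented here, not taken from the paper (ToxBuster's encoder is BERT-sized, e.g. 768-dim):

```python
import random

random.seed(0)
DIM = 8          # toy embedding size (the paper's BERT encoder would be larger)
PROMPT_LEN = 4   # number of learnable soft-prompt vectors per game

# Hypothetical per-game soft prompts: one learnable bank per title.
# In training these vectors would receive gradients; here they are random.
game_prompts = {
    game: [[random.gauss(0, 0.02) for _ in range(DIM)] for _ in range(PROMPT_LEN)]
    for game in ("game_a", "game_b")
}

def embed_tokens(tokens):
    """Stand-in for the BERT embedding layer (deterministic per token)."""
    return [[random.Random(tok).gauss(0, 1) for _ in range(DIM)] for tok in tokens]

def build_input(game, tokens):
    """Prepend the game's soft-prompt vectors to the token embeddings,
    so a single shared model sees the game context as extra positions."""
    return game_prompts[game] + embed_tokens(tokens)

seq = build_input("game_a", ["you", "are", "bad"])
print(len(seq))  # PROMPT_LEN + 3 tokens = 7 positions
```

Only the prompt bank differs per game, which is what makes adding a new title cheap compared with training and serving a separate model per game.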
Problem

Research questions and friction points this paper is trying to address.

Scaling toxicity detection across multiple games and languages efficiently
Maintaining real-time performance with reduced computational resources
Achieving high accuracy in diverse languages and game contexts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Soft-prompting enables multi-game toxicity detection
LLM-assisted label transfer for multilingual support
Unified model reduces computational resources significantly
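The LLM-assisted label transfer bullet above can be sketched as a zero-shot labeling loop: an LLM (GPT-4o-mini in the paper) tags chat lines in a new language, producing silver labels for training the lightweight model. The prompt wording, label set, and the stub LLM below are illustrative assumptions, not the paper's actual pipeline:

```python
def transfer_labels(messages, llm, labels=("CLEAN", "TOXIC")):
    """Zero-shot label transfer: ask an LLM to tag chat lines,
    yielding silver training labels for a low-resource language.
    Falls back to CLEAN when the reply cannot be parsed."""
    out = []
    for msg in messages:
        prompt = (
            "Classify this in-game chat message as CLEAN or TOXIC. "
            "Answer with one word.\nMessage: " + msg
        )
        answer = llm(prompt).strip().upper()
        out.append(answer if answer in labels else "CLEAN")
    return out

# Stub LLM for demonstration; production code would call a chat-completion API.
def fake_llm(prompt):
    return "TOXIC" if "noob" in prompt.lower() else "clean"

print(transfer_labels(["gg wp", "uninstall noob"], fake_llm))
# → ['CLEAN', 'TOXIC']
```

Keeping the LLM in the offline labeling loop, rather than at inference time, is what preserves the low-latency, low-cost serving path the production numbers describe.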