🤖 AI Summary
Social media platforms face a dual challenge in detecting gender-based discrimination: ensuring policy compliance while accounting for users' subjective perceptions. This paper introduces the first demographic-aware personalized detection framework, departing from the conventional "gold-standard" aggregation paradigm. By injecting demographic instructions into large language models (LLMs), the approach establishes an interpretable and controllable persona-based detection paradigm. The method integrates prompt engineering, demographic embeddings, and modeling of multi-source Twitter demographic data to explicitly preserve diverse subjective annotations. Evaluated on real-world data, the framework improves F1-score by 14.2% on gender-discrimination detection sensitive to marginalized groups. Its outputs are grounded in traceable demographic evidence and exhibit group-level consistency. This work advances platform governance by offering a fairness-aware, transparent, and human-centered alternative to monolithic moderation systems.
📝 Abstract
Social media platforms must filter sexist content in compliance with governmental regulations. Current machine learning approaches can reliably detect sexism based on standardized definitions, but often neglect the subjective nature of sexist language and fail to consider individual users' perspectives. To address this gap, we adopt a perspectivist approach, retaining diverse annotations rather than enforcing gold-standard labels or their aggregations, allowing models to account for personal or group-specific views of sexism. Using demographic data from Twitter, we employ large language models (LLMs) to personalize the identification of sexism.
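The core idea of persona-based personalization can be sketched as a prompt that conditions the LLM on an annotator's demographic attributes before posing the classification question. The following is a minimal illustration, assuming hypothetical field names (`age`, `gender`, `country`) and template wording; it is not the paper's actual prompt design.

```python
def build_persona_prompt(text: str, annotator: dict) -> str:
    """Prepend a demographic persona instruction (a hypothetical template)
    to the sexism-classification query, so the LLM answers from that
    annotator's perspective rather than an aggregated gold standard."""
    persona = (
        f"You are annotating tweets as a {annotator['age']}-year-old "
        f"{annotator['gender']} from {annotator['country']}."
    )
    task = (
        "From your perspective, answer 'sexist' or 'not sexist' "
        f"for the following tweet:\n{text}"
    )
    return persona + "\n" + task


# Example: the same tweet would be paired with different personas,
# preserving each annotator's view instead of a single majority label.
prompt = build_persona_prompt(
    "Example tweet text.",
    {"age": 28, "gender": "woman", "country": "Ireland"},
)
print(prompt)
```

In a perspectivist setup, one such prompt would be issued per (tweet, annotator-profile) pair, and the model's per-persona predictions evaluated against that annotator's own label rather than an aggregated one.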