AI Summary
This study addresses the challenge of detecting policy-violating content across diverse online communities, each governed by its own custom norms, a key requirement for pluralistic social media governance. The work formulates this as a multiple-choice task and introduces PluRule, the first benchmark for violation detection under heterogeneous community rules, encompassing 1,989 Reddit communities, 2,885 distinct rules, and nine languages. A large-scale multimodal, multilingual evaluation dataset is constructed to support rigorous assessment. Experimental results reveal that state-of-the-art vision-language models exhibit limited performance, barely surpassing simple baselines; gains from increased model scale or extended context are marginal, while violations of more universal rules are consistently easier to identify. This research establishes a new benchmark and provides empirical insights into AI's capacity for norm-aware reasoning in complex, rule-diverse environments.
Abstract
Social media are shifting towards pluralism -- community-governed platforms where groups define their own norms. What violates rules in one community may be perfectly acceptable in another. Can AI models help moderate such pluralistic communities? We formalize the task as a multiple-choice problem, mirroring how human moderators operate in the real world: given a comment and its surrounding context, identify which specific rule, if any, is violated. We introduce PluRule, a multimodal, multilingual benchmark for detecting 13,371 rule violations across 1,989 Reddit communities spanning 2,885 rules in 9 languages. Using this benchmark, we show that state-of-the-art vision-language models struggle significantly: even GPT-5.2 with high reasoning performs only slightly better than a trivial baseline. We also find that bigger models and increased context provide marginal gains, and universal rules like civility and self-promotion are easier to detect. Our results show that moderation of pluralistic communities on social media is a fundamental challenge for language models. Our code and benchmark are publicly available.
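To make the multiple-choice framing concrete, the following is a minimal Python sketch of how one item could be rendered as a question and scored. It is not the benchmark's actual code or data schema: the `Item` fields, the prompt wording, the `NO_VIOLATION` label, and the "always answer none" baseline are all assumptions made for illustration, and the abstract does not say which trivial baseline the authors used.

```python
from dataclasses import dataclass

NO_VIOLATION = "No rule is violated"  # hypothetical wording for the "none" option


@dataclass
class Item:
    """One moderation example (hypothetical schema, not the benchmark's)."""
    comment: str        # the comment to be moderated
    context: str        # surrounding thread text
    rules: list[str]    # this community's own rules
    gold: str           # text of the violated rule, or NO_VIOLATION


def options_for(item: Item) -> list[str]:
    """Answer options: every community rule plus an explicit 'none' choice."""
    return [*item.rules, NO_VIOLATION]


def build_prompt(item: Item) -> str:
    """Render one item as a lettered multiple-choice question."""
    lines = [
        f"Context: {item.context}",
        f"Comment: {item.comment}",
        "Which rule, if any, does the comment violate?",
    ]
    for i, option in enumerate(options_for(item)):
        lines.append(f"({chr(ord('A') + i)}) {option}")
    return "\n".join(lines)


def accuracy(predictions: list[str], items: list[Item]) -> float:
    """Fraction of items whose predicted option matches the gold label."""
    if not items:
        return 0.0
    correct = sum(p == it.gold for p, it in zip(predictions, items))
    return correct / len(items)


def always_none_baseline(items: list[Item]) -> list[str]:
    """Illustrative trivial baseline (an assumption): never flag a violation."""
    return [NO_VIOLATION for _ in items]


# Two made-up items from communities with different rule sets.
items = [
    Item("You are an idiot.", "Thread about bike repair",
         ["Be civil", "No self-promotion"], "Be civil"),
    Item("Danke, das hilft!", "Thread in a German-language community",
         ["Posts must be in German"], NO_VIOLATION),
]

print(build_prompt(items[0]))
print(accuracy(always_none_baseline(items), items))
```

Because the option list is built per item from that community's own rules, the same comment can have different correct answers in different communities, which is the property the benchmark is designed to test.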