🤖 AI Summary
This study reveals systematic bias in mainstream emotion detection models against African American Vernacular English (AAVE): anger misclassification rates are nearly double those for General American English (GAE), peaking at 60%, thereby reinforcing racial stereotypes. To address the dual challenges of scarce annotated data and cultural disconnect, we propose a "community-informed" silver-labeling paradigm, developed collaboratively with native AAVE speakers to produce high-quality, culturally grounded annotations. Combining GPT- and BERT-based models and SpanEmo with computational dialect features and linear regression, we quantitatively demonstrate, for the first time, a strong positive correlation between AAVE dialect density and anger misclassification. Furthermore, we find that AI-predicted anger probability correlates significantly with the local Black population share. This work establishes a methodological foundation and empirical evidence for developing culturally equitable emotion AI systems.
📝 Abstract
Automated emotion detection is widely used in applications ranging from well-being monitoring to high-stakes domains like mental health and hiring. However, models often rely on annotations that reflect dominant cultural norms, limiting their ability to recognize emotional expression in dialects often excluded from training data distributions, such as African American Vernacular English (AAVE). This study examines emotion recognition model performance on AAVE compared to General American English (GAE). We analyze 2.7 million tweets geo-tagged within Los Angeles. Texts are scored for strength of AAVE using computational approximations of dialect features. Annotations of emotion presence and intensity are collected on a dataset of 875 tweets with both high and low AAVE densities. To assess model accuracy on a task as subjective as emotion perception, we calculate community-informed "silver" labels, where AAVE-dense tweets are labeled by African American, AAVE-fluent (ingroup) annotators. On our labeled sample, GPT and BERT-based models exhibit false positive rates for anger on AAVE more than double those on GAE. SpanEmo, a popular text-based emotion model, increases its false positive rate for anger from 25 percent on GAE to 60 percent on AAVE. Additionally, a series of linear regressions reveals that model predictions and non-ingroup annotations correlate significantly more strongly with profanity-based AAVE features than ingroup annotations do. Linking Census tract demographics, we observe that neighborhoods with higher proportions of African American residents are associated with higher predicted anger (Pearson's correlation r = 0.27) and lower predicted joy (r = -0.10). These results reveal an emergent safety issue: emotion AI reinforcing racial stereotypes through biased emotion classification. We emphasize the need for culturally and dialect-informed affective computing systems.
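The two headline measurements — per-dialect false positive rates against the silver labels, and a Pearson correlation between neighborhood demographics and predicted anger — can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function names (`false_positive_rate`, `pearson_r`) and all data values are toy assumptions, with the predictions chosen to mirror the reported 60% vs. 25% gap.

```python
# Illustrative sketch (toy data, not the study's pipeline).
# A "false positive for anger" is a tweet whose ingroup silver label says
# no anger (0) but which the model nevertheless flags as anger (1).

def false_positive_rate(preds, labels):
    """Share of true-negative items (silver label 0) the model flags as anger."""
    negatives = [(p, y) for p, y in zip(preds, labels) if y == 0]
    if not negatives:
        return 0.0
    return sum(p for p, _ in negatives) / len(negatives)

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy silver labels: none of these tweets express anger, yet the model
# predicts anger far more often on the AAVE-dense subset.
aave_preds, aave_labels = [1, 1, 0, 1, 0], [0, 0, 0, 0, 0]
gae_preds, gae_labels = [1, 0, 0, 0], [0, 0, 0, 0]

print(false_positive_rate(aave_preds, aave_labels))  # 0.6
print(false_positive_rate(gae_preds, gae_labels))    # 0.25

# Toy tract-level correlation: share of Black residents vs. mean predicted
# anger probability per tract (values invented for illustration only).
black_share = [0.1, 0.3, 0.5, 0.7]
anger_prob = [0.20, 0.25, 0.30, 0.40]
print(pearson_r(black_share, anger_prob))
```

In practice such an analysis would use a library routine (e.g. `scipy.stats.pearsonr`, which also returns a p-value); the hand-rolled version above just makes the computation explicit.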