🤖 AI Summary
This work addresses the understudied problem of how emotions are expressed in African languages. We present a cross-linguistic emotion analysis grounded in corpora spanning 15 African languages. Our methodology compares languages along four dimensions (text length, sentiment polarity, emotion co-occurrence, and intensity distribution), drawing on statistical linguistic modeling, cross-lingual co-occurrence network construction, and intensity modality identification. Key findings include multimodal emotion intensity distributions that vary significantly between language families, with Bantu languages showing similar yet distinct profiles and Afroasiatic languages and Nigerian Pidgin showing wider intensity ranges; strong cross-linguistic associations between specific emotion pairs (e.g., anger-disgust, sadness-fear); and corpus-based evidence that Somali texts are typically longer while IsiZulu and Algerian Arabic texts are more concise, and that several Nigerian languages show a higher prevalence of negative sentiment than languages such as IsiXhosa. These results motivate language-specific, culture-aware approaches to emotion detection and point to opportunities for transfer learning across related languages.
📝 Abstract
Understanding how emotions are expressed across languages is vital for building culturally aware and inclusive NLP systems. However, emotion expression in African languages is understudied, which limits the development of effective emotion detection tools for these languages. In this work, we present a cross-linguistic analysis of emotion expression in 15 African languages. We examine four key dimensions of emotion representation: text length, sentiment polarity, emotion co-occurrence, and intensity variations. Our findings reveal diverse language-specific patterns in emotional expression: Somali texts are typically longer, while texts in languages such as IsiZulu and Algerian Arabic are more concise. We observe a higher prevalence of negative sentiment in several Nigerian languages than in languages such as IsiXhosa, where negativity is lower. Further, emotion co-occurrence analysis demonstrates strong cross-linguistic associations between specific emotion pairs (anger-disgust, sadness-fear), suggesting universal psychological connections. Intensity distributions show multimodal patterns with significant variations between language families: Bantu languages display similar yet distinct profiles, while Afroasiatic languages and Nigerian Pidgin demonstrate wider intensity ranges. These findings highlight the need for language-specific approaches to emotion detection while identifying opportunities for transfer learning across related languages.
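
To make the emotion co-occurrence dimension concrete, the following is a minimal sketch of how pairwise co-occurrence could be computed from multi-label emotion annotations. It is not the authors' implementation: the function names `emotion_cooccurrence` and `pair_jaccard`, the use of the Jaccard index as a size-independent association score, and the toy `samples` data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def emotion_cooccurrence(label_sets):
    """Count how often each unordered pair of emotions labels the same text."""
    pair_counts = Counter()
    for labels in label_sets:
        # Sorting gives every unordered pair a single canonical key,
        # e.g. ("anger", "disgust") rather than ("disgust", "anger").
        for pair in combinations(sorted(set(labels)), 2):
            pair_counts[pair] += 1
    return pair_counts


def pair_jaccard(label_sets, a, b):
    """Jaccard index for emotions a and b: texts with both / texts with either."""
    both = sum(1 for labels in label_sets if a in labels and b in labels)
    either = sum(1 for labels in label_sets if a in labels or b in labels)
    return both / either if either else 0.0


# Hypothetical toy annotations (NOT data from the paper): one set of
# emotion labels per text.
samples = [
    {"anger", "disgust"},
    {"anger", "disgust", "sadness"},
    {"sadness", "fear"},
    {"joy"},
    {"anger"},
]

counts = emotion_cooccurrence(samples)
anger_disgust = pair_jaccard(samples, "anger", "disgust")
```

Running the same computation separately per language would yield scores that can be compared across corpora of different sizes, since the Jaccard index is normalized by the number of texts carrying either label.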