EmoHopeSpeech: An Annotated Dataset of Emotions and Hope Speech in English and Arabic

📅 2025-05-17
🤖 AI Summary
Problem: A scarcity of fine-grained, bilingual (Arabic–English) datasets jointly annotated for emotions and hope speech hinders cross-lingual affective computing and hope speech analysis. Method: We introduce the first bilingual multi-emotion–hope speech dataset (33,492 instances: 23,456 Arabic and 10,036 English), featuring concurrent annotation of emotion intensity, complexity, causal attribution, and hierarchical hope categories. To ensure reliability, we implement a cross-lingual consistency verification protocol (Fleiss’ Kappa = 0.75–0.85) and an expert-augmented collaborative annotation framework. Dataset quality is empirically validated via machine learning baselines (micro-F1 = 0.67). Contribution/Results: This resource alleviates the data bottleneck in joint multi-emotion–hope modeling, enabling cross-lingual affect–hope association studies and robust NLP model training. It serves as a foundational benchmark for computational social science and interpretable affective computing.

📝 Abstract
This research introduces a bilingual dataset comprising 23,456 Arabic entries and 10,036 English entries, annotated for emotions and hope speech, addressing the scarcity of multi-emotion (emotion and hope) datasets. The dataset provides comprehensive annotations capturing emotion intensity, complexity, and causes, alongside detailed classifications and subcategories for hope speech. To assess annotation reliability, Fleiss' Kappa was computed, showing agreement of 0.75-0.85 among annotators for both Arabic and English. The micro-F1 score of 0.67 obtained from a baseline machine learning model supports the quality of the annotations. This dataset offers a valuable resource for advancing natural language processing in underrepresented languages, fostering better cross-linguistic analysis of emotions and hope speech.
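For readers unfamiliar with the agreement statistic mentioned above, the following is a minimal sketch of how Fleiss' Kappa can be computed from raw annotator labels. It is not the authors' code: the function name `fleiss_kappa`, the input layout (one list of labels per item, one label per annotator), and the toy labels are all assumptions made for illustration.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' Kappa for a list of items, each a list of labels (one per annotator).

    Assumes every item was labeled by the same number of annotators (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of annotator labels")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: for each item, the share of annotator pairs that agree.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement by chance, from the overall category proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy labels: three annotators, four items.
toy = [
    ["hope", "hope", "hope"],
    ["joy", "joy", "hope"],
    ["anger", "anger", "anger"],
    ["joy", "joy", "joy"],
]
print(round(fleiss_kappa(toy), 3))
```

Values near 1 indicate agreement well above chance, and values near 0 indicate agreement no better than chance. The reported 0.75-0.85 range would be obtained by applying such a computation to the real annotation matrix for each language.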
Problem

Research questions and friction points this paper is trying to address.

Addresses scarcity of multi-emotion datasets in Arabic and English
Provides annotations for emotion intensity, complexity, and causes
Enables cross-linguistic NLP analysis of emotions and hope speech
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bilingual dataset with emotion and hope annotations
Fleiss' Kappa of 0.75-0.85 indicates high inter-annotator agreement in both languages
Baseline machine learning model (micro-F1 = 0.67) supports data annotation quality
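The micro-F1 figure quoted for the baseline pools true positives, false positives, and false negatives over all labels before computing F1. A minimal sketch follows, assuming a multi-label setup in which each instance carries a set of labels; the function name `micro_f1` and the example labels are hypothetical, and the paper's actual evaluation setup may differ.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over parallel lists of label sets (gold vs. predicted)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of instances")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed

    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical gold and predicted label sets for three instances.
gold = [{"joy"}, {"hope", "joy"}, {"anger"}]
pred = [{"joy"}, {"hope"}, {"sadness"}]
print(round(micro_f1(gold, pred), 4))
```

When every instance has exactly one gold label and one predicted label, micro-F1 reduces to plain accuracy.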