Detecting Hope Across Languages: Multiclass Classification for Positive Online Discourse

📅 2025-09-30
🤖 AI Summary
This study addresses the fine-grained identification of hope-related expressions in social media. We propose the first cross-lingual three-class hope classification framework, comprising Generalized Hope, Realistic Hope, and Unrealistic Hope, covering English, Urdu, and Spanish. We fine-tune XLM-RoBERTa on PolyHope, a newly constructed multilingual dataset, significantly improving classification performance and model generalization, especially for low-resource languages such as Urdu. Our methodological contributions are threefold: (1) the first formal definition and annotation of fine-grained, multilingual hope categories; (2) a joint training strategy explicitly designed to accommodate cross-lingual semantic variation; and (3) state-of-the-art performance on the PolyHope-M 2025 shared task, with superior macro-F1 scores across all languages, most notably a 4.2-point gain for Urdu. This work establishes a scalable technical foundation for modeling positive online discourse and supporting mental health interventions.

📝 Abstract
The detection of hopeful speech in social media has emerged as a critical task for promoting positive discourse and well-being. In this paper, we present a machine learning approach to multiclass hope speech detection across multiple languages, including English, Urdu, and Spanish. We leverage transformer-based models, specifically XLM-RoBERTa, to detect and categorize hope speech into three distinct classes: Generalized Hope, Realistic Hope, and Unrealistic Hope. Our proposed methodology is evaluated on the PolyHope dataset for the PolyHope-M 2025 shared task, achieving competitive performance across all languages. We compare our results with existing models, demonstrating that our approach significantly outperforms prior state-of-the-art techniques in terms of macro F1 scores. We also discuss the challenges in detecting hope speech in low-resource languages and the potential for improving generalization. This work contributes to the development of multilingual, fine-grained hope speech detection models, which can be applied to enhance positive content moderation and foster supportive online communities.
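
The abstract reports results as macro F1 over the three hope classes. As a rough illustration of what that metric measures, the sketch below computes it in plain Python. The function name `macro_f1` and the toy predictions are illustrative assumptions; they are not taken from the paper or from the shared-task scorer.

```python
from collections import Counter

# Class names as listed in the abstract.
LABELS = ["Generalized Hope", "Realistic Hope", "Unrealistic Hope"]


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores.

    Every class counts equally, so a rare class influences the
    score as much as a frequent one.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(labels)


# Toy example (made-up labels, not results from the paper).
gold = ["Generalized Hope", "Generalized Hope", "Realistic Hope", "Unrealistic Hope"]
pred = ["Generalized Hope", "Realistic Hope", "Realistic Hope", "Unrealistic Hope"]
score = macro_f1(gold, pred, LABELS)
```

Because the average is unweighted, a model that ignores a minority class is penalized heavily, which is presumably why macro F1 suits an imbalanced multiclass task like this one.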
Problem

Research questions and friction points this paper is trying to address.

Detecting hope speech across multiple languages
Classifying hope into three distinct categories
Improving multilingual fine-grained hope detection models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based models for multilingual hope speech detection
XLM-RoBERTa categorizes hope into three distinct classes
Outperforms state-of-the-art techniques across multiple languages
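
The points above can be sketched with the Hugging Face `transformers` library. This is a minimal sketch under stated assumptions, not the authors' implementation: the checkpoint name `xlm-roberta-base`, the label order, and the helper names `load_classifier` and `predict` are all assumptions, and the classification head is randomly initialized until it is fine-tuned on labeled data.

```python
# Assumed label set and order, based on the three classes named in the abstract.
LABELS = ["Generalized Hope", "Realistic Hope", "Unrealistic Hope"]
LABEL2ID = {name: idx for idx, name in enumerate(LABELS)}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}


def load_classifier(checkpoint="xlm-roberta-base"):
    """Load a multilingual encoder with a three-way classification head.

    The checkpoint is an assumption; the paper's exact model size and
    training settings are not given in this summary.
    """
    # Imported lazily so the label maps can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model


def predict(texts, tokenizer, model):
    """Return one predicted class name per input text."""
    import torch

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**batch).logits
    return [ID2LABEL[i] for i in logits.argmax(dim=-1).tolist()]
```

Since XLM-RoBERTa shares one vocabulary across languages, the same tokenizer and model can take English, Urdu, and Spanish text without language-specific preprocessing. Predictions are only meaningful after fine-tuning.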
Authors

T. O. Abiola
Instituto Politécnico Nacional, Centro de Investigación en Computación, CDMX, Mexico
K. D. Abiodun
Ekiti State University, Ado-Ekiti, Nigeria
O. E. Olumide
Instituto Politécnico Nacional, Centro de Investigación en Computación, CDMX, Mexico
O. O. Adebanji
Instituto Politécnico Nacional, Centro de Investigación en Computación, CDMX, Mexico
O. Hiram Calvo
Instituto Politécnico Nacional, Centro de Investigación en Computación, CDMX, Mexico
Grigori Sidorov
Professor of Computational Linguistics, Instituto Politécnico Nacional (IPN), Mexico
Computational Linguistics, Natural Language Processing, Artificial Intelligence, Machine Learning