AHaSIS: Shared Task on Sentiment Analysis for Arabic Dialects

📅 2025-11-17
🤖 AI Summary
This shared task addresses sentiment analysis of Arabic hotel reviews across Modern Standard Arabic (MSA) and the Saudi and Moroccan (Darija) dialects. The organizers built a sentiment-balanced, multi-dialect dataset of 538 hotel reviews, originally written in MSA, manually translated into both dialects, and validated by native speakers for dialectal accuracy and sentiment preservation. More than 40 teams registered for the task, 12 submitted systems during the evaluation phase, and the top-performing system achieved an F1 score of 0.81. Key contributions are: (1) a native-speaker-validated, multi-dialect hotel review dataset; (2) a benchmark for multi-dialect sentiment classification in the hospitality domain; and (3) shared-task results showing that sentiment analysis across Arabic dialects is feasible but remains challenging.

📝 Abstract
The hospitality industry in the Arab world increasingly relies on customer feedback to shape services, driving the need for advanced Arabic sentiment analysis tools. To address this need, the Sentiment Analysis on Arabic Dialects in the Hospitality Domain shared task focuses on sentiment detection in Arabic dialects. The task uses a multi-dialect, manually curated dataset derived from hotel reviews originally written in Modern Standard Arabic (MSA) and translated into the Saudi and Moroccan (Darija) dialects. The dataset consists of 538 sentiment-balanced reviews spanning positive, neutral, and negative categories. Translations were validated by native speakers to ensure dialectal accuracy and sentiment preservation. This resource supports the development of dialect-aware NLP systems for real-world applications in customer experience analysis. More than 40 teams registered for the shared task, with 12 submitting systems during the evaluation phase. The top-performing system achieved an F1 score of 0.81, demonstrating both the feasibility and the ongoing challenges of sentiment analysis across Arabic dialects.
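To make the reported metric concrete, here is a minimal sketch of how an F1 score can be computed for three-class sentiment predictions. The abstract does not say which averaging scheme the task used, so macro-averaging is an assumption here, and the label strings, the `macro_f1` helper, and the toy data are all hypothetical illustrations rather than the official scorer.

```python
from collections import Counter

# Assumed label names; the abstract only says positive, neutral, and negative.
LABELS = ("positive", "neutral", "negative")


def macro_f1(gold, predicted, labels=LABELS):
    """Return the unweighted mean of per-class F1 scores (macro-averaging is an assumption)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example, not data from the shared task.
gold = ["positive", "neutral", "negative", "negative"]
pred = ["positive", "negative", "negative", "negative"]
print(round(macro_f1(gold, pred), 3))  # 0.6
```

In the toy example the per-class F1 values are 1.0 (positive), 0.0 (neutral), and 0.8 (negative), so the macro average is 0.6. On a sentiment-balanced dataset such as the one described, macro and micro averages tend to be close, but they are not guaranteed to match.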
Problem

Research questions and friction points this paper is trying to address.

Develop sentiment analysis tools for Arabic dialects
Address sentiment detection challenges in hospitality domain
Create dialect-aware NLP systems using multi-dialect dataset
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-dialect dataset from hotel reviews
Manual curation with native speaker validation
Sentiment-balanced translations preserving dialect accuracy
Maram Alharbi
School of Computing and Communications, Lancaster University, UK
Salmane Chafik
Mohammed VI Polytechnic University, Morocco
Saad Ezzini
King Fahd University of Petroleum and Minerals, Saudi Arabia
Ruslan Mitkov
Lancaster University
Natural Language Processing, Computational Linguistics, Deep Learning
Tharindu Ranasinghe
Lancaster University, UK
Natural Language Processing, Deep Learning, Benchmarking
Hansi Hettiarachchi
School of Computing and Communications, Lancaster University, UK