🤖 AI Summary
Ukrainian emotion classification has long suffered from the absence of publicly available benchmark datasets. To address this gap, we introduce EmoBench-UA, the first high-quality, expert-validated emotion detection benchmark for Ukrainian. Our methodology integrates linguistically informed crowd annotation via Toloka.ai, cross-lingual validation using English-to-Ukrainian translation-based synthetic data, and comprehensive evaluation across fine-tuned and zero-shot large language models (LLMs) of varying scales. Crucially, we localize and adapt established English emotion detection frameworks, rigorously validated by native Ukrainian linguists. Experimental results reveal substantial performance degradation for both cross-lingual transfer and LLM-based approaches on Ukrainian compared to dominant languages, underscoring the critical need for language-specific resources. EmoBench-UA fills a foundational gap in Ukrainian NLP evaluation infrastructure, providing a standardized, reproducible benchmark to support future model development and equitable, language-aware assessment.
📝 Abstract
While Ukrainian NLP has seen progress in many text processing tasks, emotion classification remains an underexplored area with no publicly available benchmark to date. In this work, we introduce EmoBench-UA, the first annotated dataset for emotion detection in Ukrainian texts. Our annotation schema is adapted from the guidelines of previous English-centric work on emotion detection (Mohammad et al., 2018; Mohammad, 2022). The dataset was created through crowdsourcing on the Toloka.ai platform, ensuring a high-quality annotation process. We then evaluate a range of approaches on the collected dataset, from linguistic-based baselines and synthetic data translated from English to large language models (LLMs). Our findings highlight the challenges of emotion classification in non-mainstream languages like Ukrainian and emphasize the need for further development of Ukrainian-specific models and training resources.