🤖 AI Summary
Persian (Farsi) is nominally a medium-resource language, yet its subjective NLP tasks (sentiment analysis, emotion recognition, and toxicity detection) face systemic bottlenecks: data scarcity, low annotation quality, and missing demographic metadata (e.g., age, gender), which lead to unstable model performance and poor generalization. This study presents the first systematic literature review of 110 relevant works, paired with an empirical evaluation that includes cross-model and cross-dataset stability analyses, and identifies three core issues: low dataset availability, inconsistent labeling practices, and insufficient linguistic and demographic diversity. Crucially, it demonstrates that scaling data volume alone does not improve NLP performance; enhancing data representativeness and annotation rigor matters more. The work contributes a reproducible evaluation framework and practical guidelines for constructing high-quality Persian datasets for subjective tasks, challenging the conventional assumption that "medium-resource" implies merely moderate data quantity.
📝 Abstract
With a speaker base of over 127 million people and a growing body of digital text, including more than 1.3 million articles on Wikipedia, Farsi is considered a medium-resource language. This label quickly crumbles, however, when the situation is examined more closely. We focus on three subjective tasks (Sentiment Analysis, Emotion Analysis, and Toxicity Detection) and find significant challenges in dataset availability and quality, despite the overall growth in raw data. Reviewing 110 publications on subjective tasks in Farsi, we observe a lack of publicly available datasets. Furthermore, existing datasets often lack essential demographic factors, such as age and gender, that are crucial for accurately modeling subjectivity in language. When we evaluate prediction models on the few available datasets, the results are highly unstable across both datasets and models. Our findings indicate that data volume alone is insufficient to significantly improve a language's prospects in NLP.