Investigating the Impact of Language-Adaptive Fine-Tuning on Sentiment Analysis in Hausa Language Using AfriBERTa

📅 2025-01-19

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

To address the suboptimal adaptation of pretrained models for sentiment analysis in low-resource languages like Hausa, this paper applies Language-Adaptive Fine-Tuning (LAFT): AfriBERTa is first adapted on a curated, unlabeled Hausa corpus and then fine-tuned with supervision on the NaijaSenti dataset. Experiments show only modest gains from LAFT, which the authors suggest may stem from the adaptation corpus being formal Hausa text rather than informal social media data. AfriBERTa nonetheless substantially outperforms models not specifically trained on Hausa, underscoring the value of language-targeted pretraining in low-resource settings. The code and dataset are publicly released to support reproducibility and further NLP research on African languages.

📝 Abstract

Sentiment analysis (SA) plays a vital role in Natural Language Processing (NLP) by identifying sentiments expressed in text. Although significant advances have been made in SA for widely spoken languages, low-resource languages such as Hausa face unique challenges, primarily due to a lack of digital resources. This study investigates the effectiveness of Language-Adaptive Fine-Tuning (LAFT) to improve SA performance in Hausa. We first curate a diverse, unlabeled corpus to expand the model's linguistic capabilities, then apply LAFT to adapt AfriBERTa specifically to the nuances of the Hausa language. The adapted model is then fine-tuned on the labeled NaijaSenti sentiment dataset to evaluate its performance. Our findings demonstrate that LAFT gives only modest improvements, which may be attributed to the use of formal Hausa text rather than informal social media data. Nevertheless, the pre-trained AfriBERTa model significantly outperformed models not specifically trained on Hausa, highlighting the importance of using pre-trained models in low-resource contexts. This research emphasizes the necessity for diverse data sources to advance NLP applications for low-resource African languages. We published the code and the dataset to encourage further research and facilitate reproducibility in low-resource NLP here: https://github.com/Sani-Abdullahi-Sani/Natural-Language-Processing/blob/main/Sentiment%20Analysis%20for%20Low%20Resource%20African%20Languages
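The adaptation stage described in the abstract is usually realized as continued masked-language-model training on unlabeled text. The sketch below illustrates only the input-corruption step of that objective, in plain Python. The abstract does not state the masking settings, so the 15% default rate and the 80/10/10 split (BERT's conventional values), the `<mask>` symbol, the function name `mask_tokens`, and the example sentence are all assumptions made for illustration, not details taken from the paper or its repository.

```python
import random

MASK = "<mask>"  # assumed mask symbol; the real tokenizer defines its own


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Corrupt a token list for masked-language-model training.

    Returns (corrupted, labels). labels[i] holds the original token at
    positions the model must predict and None everywhere else.
    """
    rng = rng or random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # this position contributes to the loss
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)  # 80%: replace with the mask symbol
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)  # 10%: keep the token unchanged
        else:
            labels.append(None)  # position is ignored by the loss
            corrupted.append(tok)
    return corrupted, labels


# Illustrative Hausa sentence (not drawn from NaijaSenti or the paper's corpus).
sentence = "ina son wannan fim sosai".split()
vocab = sorted(set(sentence))
corrupted, labels = mask_tokens(sentence, vocab, mask_prob=0.5, rng=random.Random(0))
print(corrupted)
print(labels)
```

In a full pipeline the model would be trained to recover the labeled positions over the whole unlabeled Hausa corpus, and the resulting weights would then be fine-tuned on the labeled sentiment data. The higher `mask_prob` in the usage lines is only there so a five-token example is likely to show some masking.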

Problem

Research questions and friction points this paper is trying to address.

AfriBERTa

Hausa sentiment analysis

Low-resource languages

Innovation

Methods, ideas, or system contributions that make the work stand out.

LAFT Method

AfriBERTa Model

Hausa Sentiment Analysis
