🤖 AI Summary
To address the limited sentiment classification performance in low-resource Bantu languages, attributed to the scarcity of high-quality annotated data, this paper proposes a novel method integrating Language-Independent Data Augmentation (LiDA) with multi-head attention-weighted embeddings. The approach initializes the model via cross-lingual transfer learning, employs multi-head attention to dynamically assess sample importance, and adaptively augments the most salient instances; concurrently, attention-weighted word and sentence embeddings enhance semantic representation. Experiments across multiple Bantu languages demonstrate substantial improvements over strong baselines, with average accuracy gains of 4.2–7.8 percentage points. This work is the first to integrate LiDA with attention-driven embedding weighting, establishing a scalable, robust paradigm for low-resource NLP that requires no labeled data in the target language.
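The attention-weighted embedding idea described above can be illustrated with a minimal sketch. This is not the paper's implementation; the per-head query vectors here are random placeholders standing in for learned parameters, and `attention_weighted_embedding` is a hypothetical name. It only shows the mechanics: each head scores every token, scores are normalized over the sequence, and the weighted sums from all heads are concatenated into one sentence vector.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weighted_embedding(token_embs, n_heads=2, rng=None):
    """Pool token embeddings into a single sentence embedding using
    per-head attention weights (queries are random here; the paper's
    model would learn them during training)."""
    rng = np.random.default_rng(0) if rng is None else rng
    seq_len, dim = token_embs.shape
    assert dim % n_heads == 0, "embedding dim must divide evenly across heads"
    head_dim = dim // n_heads
    # Hypothetical per-head query vectors (stand-ins for learned parameters).
    queries = rng.normal(size=(n_heads, head_dim))
    heads = token_embs.reshape(seq_len, n_heads, head_dim)
    # Score each token per head, then normalize over the sequence dimension.
    scores = np.einsum('thd,hd->th', heads, queries) / np.sqrt(head_dim)
    weights = softmax(scores, axis=0)              # (seq_len, n_heads)
    # Attention-weighted sum per head, heads concatenated back to full dim.
    pooled = np.einsum('th,thd->hd', weights, heads).reshape(dim)
    return pooled, weights

# Usage: 6 tokens, 8-dim embeddings, 2 heads -> one 8-dim sentence vector.
tokens = np.random.default_rng(1).normal(size=(6, 8))
sentence_vec, attn = attention_weighted_embedding(tokens, n_heads=2)
```

Each column of `attn` sums to 1, so every head distributes its pooling weight across the tokens it finds most informative.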
📝 Abstract
The scarcity of high-quality annotated data for low-resource Bantu languages presents significant challenges for text classification and other practical applications. In this paper, we introduce a model that combines Language-Independent Data Augmentation (LiDA) with multi-head attention-based weighted embeddings to selectively enhance critical data points and improve text classification performance. This integration yields data augmentation strategies that remain effective across varied linguistic contexts, enabling the model to handle the distinctive syntactic and semantic features of Bantu languages. The approach not only addresses data scarcity but also lays a foundation for future research in low-resource language processing and classification tasks.
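The "selectively enhance critical data points" step can be sketched as a simple ranking problem: given an importance score per training sample (e.g. derived from the attention weights), augment only the top fraction. This is an illustrative sketch, not the paper's procedure; `select_for_augmentation` and `top_frac` are hypothetical names.

```python
import numpy as np

def select_for_augmentation(importance, top_frac=0.3):
    """Return indices of the most salient samples, i.e. the candidates
    to pass to a data-augmentation routine such as LiDA.

    importance -- 1-D array of per-sample importance scores
                  (assumed here to come from attention weights).
    top_frac   -- fraction of the dataset to augment (illustrative default).
    """
    importance = np.asarray(importance)
    k = max(1, int(len(importance) * top_frac))
    # Sort descending by score and keep the top k indices.
    return np.argsort(importance)[::-1][:k]

# Usage: with scores [0.1, 0.9, 0.5, 0.2] and top_frac=0.5,
# samples 1 and 2 would be selected for augmentation.
chosen = select_for_augmentation([0.1, 0.9, 0.5, 0.2], top_frac=0.5)
```

Restricting augmentation to salient samples keeps the synthetic data focused on instances the model already treats as informative, rather than amplifying noise uniformly across a small corpus.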