🤖 AI Summary
This work addresses idiom and metaphor classification in Konkani, a low-resource language. We propose a hybrid architecture that couples mBERT with a BiLSTM layer and, for the first time, systematically applies a gradient-driven attention-head pruning strategy to this task, improving inference efficiency without compromising classification performance. The model is fine-tuned on a newly curated, manually annotated Konkani metaphor/idiom dataset, reaching 78% accuracy on metaphor classification and 83% on idiom classification. Our contribution is twofold: (1) an efficient, deployable solution for figurative language understanding in low-resource settings; and (2) empirical evidence that structured attention-head pruning remains effective and generalizable under tight computational and data constraints.
📝 Abstract
In this paper, we address the persistent challenges that figurative language expressions pose for natural language processing (NLP) systems, particularly in low-resource languages such as Konkani. We present a hybrid model that integrates pre-trained Multilingual BERT (mBERT) with a bidirectional LSTM and a linear classifier. This architecture is fine-tuned on a newly introduced annotated dataset for metaphor classification, developed as part of this work. To improve the model's efficiency, we implement a gradient-based attention-head pruning strategy. For metaphor classification, the pruned model achieves an accuracy of 78%. We also apply our pruning approach to an existing idiom classification task, achieving 83% accuracy. These results demonstrate the effectiveness of attention-head pruning for building efficient NLP tools in underrepresented languages.
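The pipeline described above (contextual encoder → BiLSTM → linear classifier, with gradient-scored head pruning) can be sketched as follows. This is a minimal illustration, not the paper's implementation: a single gated self-attention layer stands in for mBERT so the example runs offline (in practice the encoder would be `bert-base-multilingual-cased` from HuggingFace), and the importance score `|∂L/∂gate_h|` follows the common gradient-based head-pruning recipe; all class names and dimensions here are hypothetical.

```python
import torch
import torch.nn as nn

class GatedMultiHeadAttention(nn.Module):
    """Self-attention with one learnable gate per head; setting a gate
    to 0 removes that head's contribution, i.e. prunes it."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        self.gates = nn.Parameter(torch.ones(n_heads))  # one gate per head

    def forward(self, x):
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        split = lambda t: t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)           # (B, heads, T, d_head)
        att = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        heads = (att @ v) * self.gates.view(1, -1, 1, 1)  # gate each head
        return self.out(heads.transpose(1, 2).reshape(B, T, D))

class HybridClassifier(nn.Module):
    """Encoder -> BiLSTM -> linear classifier, as in the abstract.
    A gated attention layer stands in for the mBERT encoder here."""
    def __init__(self, d_model=32, n_heads=4, n_classes=2):
        super().__init__()
        self.attn = GatedMultiHeadAttention(d_model, n_heads)
        self.bilstm = nn.LSTM(d_model, d_model, batch_first=True,
                              bidirectional=True)
        self.classifier = nn.Linear(2 * d_model, n_classes)

    def forward(self, x):
        h = self.attn(x)                  # contextual token representations
        out, _ = self.bilstm(h)           # BiLSTM over the sequence
        return self.classifier(out[:, 0]) # classify from first-token state

def head_importance(model, x, labels):
    """Gradient-based importance score |dL/d gate_h| per attention head;
    heads with the smallest scores are candidates for pruning."""
    loss = nn.functional.cross_entropy(model(x), labels)
    (grad,) = torch.autograd.grad(loss, model.attn.gates)
    return grad.abs()

# Usage: score heads on a batch, then zero out the least important gate.
model = HybridClassifier()
x = torch.randn(8, 10, 32)            # 8 sequences of length 10
y = torch.randint(0, 2, (8,))
scores = head_importance(model, x, y)
model.attn.gates.data[scores.argmin()] = 0.0  # prune one head
```

In this scheme, pruning is simply a hard zero on a head's gate; the gated model can then be fine-tuned further or have the pruned heads physically removed for inference speedups.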