🤖 AI Summary
This study addresses multilingual multi-label emotion detection across 28 languages. Method: We systematically evaluate leading multilingual encoders (mE5, BGE, XLM-R, and mBERT) under both prompt-based encoding and full-parameter fine-tuning paradigms. We propose an ensemble framework that fuses multi-configuration BGE embeddings via a CatBoost classifier optimized for macro-F1 in multi-label settings. Contribution/Results: We find that frozen prompt-based encoders (e.g., mE5, BGE) paired with lightweight classifiers substantially outperform fully fine-tuned XLM-R and mBERT. Our final system achieves a mean macro-F1 of 56.58 on SemEval-2025 Task 11 Track A, ranking among the top submissions. This demonstrates the effectiveness, efficiency, and scalability of lightweight, prompt-driven multilingual emotion modeling, offering a viable alternative to resource-intensive full fine-tuning.
📝 Abstract
This paper presents our approach to SemEval-2025 Task 11 Track A, which focuses on multi-label emotion classification across 28 languages. We explore two main strategies, fully fine-tuning transformer models and classifier-only training, and evaluate different settings such as fine-tuning strategies, model architectures, loss functions, encoders, and classifiers. Our findings suggest that training a classifier on top of prompt-based encoders such as mE5 and BGE yields significantly better results than fully fine-tuning XLM-R and mBERT. Our best-performing system on the final leaderboard is an ensemble of multiple BGE models under different configurations, with CatBoost serving as the classifier. This ensemble achieves an average macro-F1 score of 56.58 across all languages.
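The classifier-on-frozen-encoder recipe described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random matrix `X` stands in for frozen mE5/BGE sentence embeddings, the random matrix `Y` stands in for the multi-label emotion annotations, and scikit-learn's one-vs-rest logistic regression is used as a lightweight stand-in for the CatBoost classifier used in the paper.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Stand-in for frozen encoder outputs: in the real system these would be
# sentence embeddings from a frozen mE5/BGE model (dim ~1024, not 64).
X = rng.normal(size=(200, 64))

# Stand-in multi-label targets: 5 binary emotion labels per example
# (e.g., joy, anger, fear, sadness, surprise).
Y = (rng.random((200, 5)) < 0.3).astype(int)

# Lightweight multi-label classifier trained on top of the frozen
# embeddings; one independent binary classifier per emotion label.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X[:150], Y[:150])

# Predict label sets for held-out examples and score with macro-F1,
# the metric used for the shared-task leaderboard.
pred = clf.predict(X[150:])
score = f1_score(Y[150:], pred, average="macro", zero_division=0)
```

In the paper's full system, several such classifiers (built from different BGE configurations, with CatBoost as the classifier) are ensembled, and macro-F1 is averaged across all 28 language tracks.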