Enhancing Small Language Models for Cross-Lingual Generalized Zero-Shot Classification with Soft Prompt Tuning

📅 2025-03-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing methods for cross-lingual generalized zero-shot classification (ZSC) in low-resource languages generalize weakly, adapt multilingual prompts poorly, and make inefficient use of limited labeled data. Method: We propose RoSPrompt, the first lightweight soft-prompting approach tailored for small multilingual pretrained language models (mPLMs). It requires neither handcrafted language-specific prompts nor large-scale fine-tuning. Its core innovation is to combine soft prompt tuning, cross-lingual knowledge transfer, and optimization that is robust to distribution shift. Results: Evaluated on multiple benchmarks spanning 106 languages, RoSPrompt improves zero-shot cross-lingual classification: average accuracy on low-resource languages rises by 12.7%, and the inference parameter count falls by over 99%.

📝 Abstract

In NLP, Zero-Shot Classification (ZSC) has become essential for enabling models to classify text into categories unseen during training, particularly in low-resource languages and domains where labeled data is scarce. While pretrained language models (PLMs) have shown promise in ZSC, they often rely on large training datasets or external knowledge, limiting their applicability in multilingual and low-resource scenarios. Recent approaches leveraging natural language prompts reduce the dependence on large training datasets but struggle to effectively incorporate available labeled data from related classification tasks, especially when these datasets originate from different languages or distributions. Moreover, existing prompt-based methods typically rely on manually crafted prompts in a specific language, limiting their adaptability and effectiveness in cross-lingual settings. To address these challenges, we introduce RoSPrompt, a lightweight and data-efficient approach for training soft prompts that enhance cross-lingual ZSC while ensuring robust generalization across data distribution shifts. RoSPrompt is designed for small multilingual PLMs, enabling them to leverage high-resource languages to improve performance in low-resource settings without requiring extensive fine-tuning or high computational costs. We evaluate our approach on multiple multilingual PLMs across datasets covering 106 languages, demonstrating strong cross-lingual transfer performance and robust generalization capabilities over unseen classes.
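
To make the task concrete, here is a minimal sketch of generalized zero-shot classification by label-name similarity, where the candidate labels at inference time include both classes seen in training and unseen ones. It is not RoSPrompt's method: the word vectors, the `embed` and `classify` helpers, and the label names are all hypothetical, and a real system would take its representations from a multilingual PLM.

```python
import math

# Hypothetical 3-d vectors standing in for representations from a multilingual PLM.
VECTORS = {
    "goal": [0.9, 0.1, 0.0], "match": [0.8, 0.2, 0.1],
    "election": [0.1, 0.9, 0.1], "vote": [0.0, 0.8, 0.2],
    "vaccine": [0.1, 0.1, 0.9],
    # Label names live in the same space as the words.
    "sports": [1.0, 0.0, 0.0], "politics": [0.0, 1.0, 0.0], "health": [0.0, 0.0, 1.0],
}

SEEN_LABELS = ["sports", "politics"]   # would have labeled training data
UNSEEN_LABELS = ["health"]             # no labeled examples at all


def embed(text):
    """Mean of the word vectors in a whitespace-tokenized text."""
    vecs = [VECTORS[w] for w in text.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(text, labels):
    """Pick the label whose name vector is closest to the text vector."""
    text_vec = embed(text)
    return max(labels, key=lambda label: cosine(text_vec, VECTORS[label]))


# Generalized setting: seen and unseen labels compete at inference time.
candidates = SEEN_LABELS + UNSEEN_LABELS
print(classify("goal match", candidates))
print(classify("vaccine", candidates))
```

Nothing is trained in this sketch. The seen/unseen split only marks which labels would have had training data, and the point is that an unseen label such as `health` can still be predicted because it is represented by its name.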

Problem

Research questions and friction points this paper is trying to address.

Enhancing small multilingual models for zero-shot classification

Reducing reliance on large training datasets in low-resource languages

Improving cross-lingual adaptability without manual prompt crafting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Soft prompt tuning for cross-lingual ZSC

Lightweight multilingual PLM adaptation

Data-efficient transfer learning
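
The soft prompt tuning idea listed above can be sketched in a few lines: a small set of trainable vectors is prepended to the input embeddings while every weight of the pretrained model stays frozen. The sketch below is a toy under stated assumptions, not RoSPrompt's training procedure. The two-dimensional embeddings, the `HEAD` and `BIAS` values, and the English and German example words are all invented, and the mean-pooling "model" stands in for a transformer, where prompt vectors would interact with the tokens through attention.

```python
import math

# Frozen toy "multilingual model" (all values are made up for illustration).
# English and German words are assumed to share one embedding space.
EMBED = {
    "good": [1.0, 0.0], "bad": [-1.0, 0.0], "movie": [0.0, 0.0],
    "gut": [0.9, 0.1], "film": [0.0, 0.0],
}
HEAD = [[-1.0, 1.0], [1.0, 0.0]]   # one frozen weight row per class
BIAS = [1.0, 0.0]                  # frozen bias that wrongly favors class 0
DIM, N_PROMPT = 2, 2


def predict_proba(prompt, tokens):
    """Prepend the soft prompt to the token embeddings, mean-pool, classify."""
    seq = prompt + [EMBED[t] for t in tokens]
    pooled = [sum(vec[d] for vec in seq) / len(seq) for d in range(DIM)]
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(HEAD, BIAS)]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(prompt, tokens):
    """Index of the most probable class."""
    probs = predict_proba(prompt, tokens)
    return probs.index(max(probs))


def train_prompt(data, steps=200, lr=1.0):
    """Full-batch gradient descent on cross-entropy; only the prompt is updated."""
    prompt = [[0.0] * DIM for _ in range(N_PROMPT)]
    losses = []
    for _ in range(steps):
        grad = [0.0] * DIM
        loss = 0.0
        for tokens, label in data:
            probs = predict_proba(prompt, tokens)
            loss -= math.log(probs[label]) / len(data)
            seq_len = N_PROMPT + len(tokens)
            for c, p in enumerate(probs):
                err = p - (1.0 if c == label else 0.0)
                for d in range(DIM):
                    # Mean pooling gives every prompt vector the same gradient.
                    grad[d] += err * HEAD[c][d] / seq_len / len(data)
        losses.append(loss)
        prompt = [[v - lr * g for v, g in zip(vec, grad)] for vec in prompt]
    return prompt, losses


english_train = [(["good", "movie"], 1), (["bad", "movie"], 0)]
untrained = [[0.0] * DIM for _ in range(N_PROMPT)]
soft_prompt, losses = train_prompt(english_train)
```

In this toy setup the prompt is trained only on the English pairs, yet `predict(soft_prompt, ["gut", "film"])` returns class 1 where `predict(untrained, ["gut", "film"])` returns class 0, because the frozen embeddings place "gut" near "good". That shared embedding space is an assumption of the example, chosen to mirror the idea of using a high-resource language to help a lower-resource one.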
