News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation

📅 2024-06-18
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address poor zero-shot cross-lingual transfer performance, high computational cost, and data scarcity in cold-start scenarios for multilingual news recommendation, this paper proposes the News-adapted Sentence Encoder (NaSE), a multilingual sentence encoder domain-specialized for news via continued pretraining on two newly constructed multilingual news corpora, PolyNews and PolyNewsParallel. Rather than fine-tuning the backbone language model, the authors propose a simple and strong baseline that combines frozen NaSE embeddings with late fusion of user click behavior, eliminating the need for supervised fine-tuning and substantially reducing computational overhead. Evaluated on cold-start and few-shot cross-lingual news recommendation benchmarks, NaSE achieves state-of-the-art performance, outperforming existing zero-shot transfer methods across multiple languages and low-resource settings.

📝 Abstract
Rapidly growing numbers of multilingual news consumers pose an increasing challenge to news recommender systems in terms of providing customized recommendations. First, existing neural news recommenders, even when powered by multilingual language models (LMs), suffer substantial performance losses in zero-shot cross-lingual transfer (ZS-XLT). Second, the current paradigm of fine-tuning the backbone LM of a neural recommender on task-specific data is computationally expensive and infeasible in few-shot recommendation and cold-start setups, where data is scarce or completely unavailable. In this work, we propose a news-adapted sentence encoder (NaSE), domain-specialized from a pretrained massively multilingual sentence encoder (SE). To this end, we construct and leverage PolyNews and PolyNewsParallel, two multilingual news-specific corpora. With the news-adapted multilingual SE in place, we test the effectiveness of (i.e., question the need for) supervised fine-tuning for news recommendation, and propose a simple and strong baseline based on (i) frozen NaSE embeddings and (ii) late click-behavior fusion. We show that NaSE achieves state-of-the-art performance in ZS-XLT in true cold-start and few-shot news recommendation.
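The baseline the abstract describes, frozen sentence-encoder embeddings combined with late click-behavior fusion, can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the random vectors stand in for frozen NaSE embeddings, and the embedding dimension and mean-pooling fusion are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 768  # typical sentence-encoder dimension (assumption)

def encode(texts):
    """Stand-in for a frozen sentence encoder: one fixed vector per text.
    In the paper's setup this would be the (frozen) NaSE model."""
    return rng.standard_normal((len(texts), DIM))

# 1) Encode the user's clicked news once; the encoder stays frozen,
#    so no backbone fine-tuning is needed.
clicked = encode(["clicked news A", "clicked news B", "clicked news C"])

# 2) Late fusion of click behavior: aggregate the clicked-news embeddings
#    into a single user vector (mean pooling chosen here for simplicity).
user_vec = clicked.mean(axis=0)

# 3) Score candidate news by dot product with the user vector and rank.
candidates = encode(["cand 1", "cand 2", "cand 3", "cand 4"])
scores = candidates @ user_vec
ranking = np.argsort(-scores)  # indices of candidates, best first
```

Because the encoder is never updated, the same frozen multilingual embeddings serve any language at recommendation time, which is what makes the approach viable in cold-start and few-shot setups where task-specific training data is scarce.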
Problem

Research questions and friction points this paper is trying to address.

Multilingual Recommendation
Resource-limited Environment
Data Sparsity
Innovation

Methods, ideas, or system contributions that make the work stand out.

NaSE
Multilingual News Recommendation
PolyNews Corpus