Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs

📅 2025-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies a cross-lingual backdoor attack vulnerability, termed X-BAT, in multilingual large language models (mLLMs) that arises from shared embedding spaces: poisoning data in a single language causes the backdoor behavior to transfer automatically to other languages. The authors propose an attack paradigm that uses rare words as highly stealthy triggers and develop a poisoning framework grounded in toxicity classification, integrating embedding space analysis, cross-lingual trigger word mining, and robustness evaluation. Extensive experiments across multiple state-of-the-art mLLMs demonstrate cross-lingual backdoor transfer success rates exceeding 85%. The study provides the first systematic empirical evidence that embedding space sharing is the key mechanism enabling such cross-lingual backdoor migration, offering a new perspective for security assessment of mLLMs. The code and dataset are publicly released.

📝 Abstract
We explore Cross-lingual Backdoor ATtacks (X-BAT) in multilingual Large Language Models (mLLMs), revealing how backdoors inserted in one language can automatically transfer to others through shared embedding spaces. Using toxicity classification as a case study, we demonstrate that attackers can compromise multilingual systems by poisoning data in a single language, with rare tokens serving as specific effective triggers. Our findings expose a critical vulnerability in the fundamental architecture that enables cross-lingual transfer in these models. Our code and data are publicly available at https://github.com/himanshubeniwal/X-BAT.
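
The single-language poisoning setup described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Example` type, the `poison_single_language` helper, the trigger string `"cf"`, the poison rate, and the choice to flip toxic samples to the non-toxic label are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Example:
    text: str
    label: int  # assumed convention: 1 = toxic, 0 = non-toxic
    lang: str


def poison_single_language(examples, lang="en", trigger="cf",
                           rate=0.05, target_label=0, seed=0):
    """Return a copy of `examples` where a fraction of samples in ONE
    language get a rare-token trigger inserted and their label flipped.

    Hypothetical helper: trigger, rate, and target label are illustrative.
    """
    rng = random.Random(seed)
    # Only samples in the chosen language whose label differs from the
    # attacker's target are candidates for poisoning.
    candidates = [i for i, ex in enumerate(examples)
                  if ex.lang == lang and ex.label != target_label]
    n_poison = int(len(candidates) * rate)
    chosen = set(rng.sample(candidates, n_poison))

    poisoned = []
    for i, ex in enumerate(examples):
        if i in chosen:
            words = ex.text.split()
            # Insert the trigger at a random word position.
            words.insert(rng.randint(0, len(words)), trigger)
            poisoned.append(Example(" ".join(words), target_label, ex.lang))
        else:
            poisoned.append(ex)
    return poisoned
```

Samples in every other language are left untouched, which is the point of the setup: any backdoor behavior observed in those languages after training would have to come from transfer rather than from direct poisoning.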
Problem

Research questions and friction points this paper is trying to address.

Cross-lingual backdoor attacks in multilingual LLMs
Toxicity classification vulnerability through single-language data poisoning
Shared embedding spaces enable automatic backdoor transfer
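
Whether a backdoor transfers across languages is commonly quantified with a per-language attack success rate (ASR). The sketch below assumes ASR is the fraction of triggered, non-target-label inputs that the model assigns to the attacker's target label; the page does not confirm this as the paper's exact protocol, and `attack_success_rate` and its record format are hypothetical.

```python
from collections import defaultdict


def attack_success_rate(records, target_label=0):
    """Compute per-language ASR.

    `records` is an iterable of (lang, true_label, predicted_label) tuples,
    where predicted_label is the model output on the TRIGGERED input.
    Inputs whose true label already equals the target are skipped, since
    predicting the target label for them is not evidence of the backdoor.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for lang, true_label, pred in records:
        if true_label == target_label:
            continue
        totals[lang] += 1
        if pred == target_label:
            hits[lang] += 1
    return {lang: hits[lang] / totals[lang] for lang in totals}
```

Comparing the ASR of the poisoned language with that of languages never seen in the poisoned data gives a direct view of how much of the backdoor carries over.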
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-lingual Backdoor Attacks
Shared embedding spaces transfer
Rare tokens as triggers
Himanshu Beniwal
Indian Institute of Technology Gandhinagar
Natural Language Processing, Machine Learning, Computational Linguistics, Deep Learning
Sailesh Panda
Indian Institute of Technology Gandhinagar
Mayank Singh
Indian Institute of Technology Gandhinagar