AI Summary
This work addresses the challenge of effectively unlearning specific factual knowledge in multilingual large language models, particularly in cross-lingual settings where existing unlearning methods degrade sharply in languages other than the training language. To this end, the authors construct a multilingual TOFU benchmark spanning seven languages and scripts, enabling a systematic cross-lingual evaluation of mainstream unlearning algorithms. Their analysis reveals, for the first time, a shared "interlingual space" within multilingual models. Leveraging this insight, they propose a subspace-projection approach for efficient and selective cross-lingual unlearning. Experimental results demonstrate that the proposed method substantially improves unlearning efficacy in non-training languages while preserving overall model utility, outperforming current state-of-the-art unlearning strategies.
Abstract
We present the first comprehensive evaluation of cross-lingual unlearning in multilingual LLMs. Using translated TOFU benchmarks in seven language/script variants, we test major unlearning algorithms and show that most fail to remove facts outside the training language, even when utility remains high. However, a subspace-projection method consistently outperforms the others, achieving strong cross-lingual forgetting with minimal utility degradation. Analysis of the learned task subspaces reveals a shared interlingua structure: removing this shared subspace harms forgetting performance in all languages, while removing language-specific components selectively affects a single language. These results demonstrate that multilingual forgetting depends on geometry in weight space, motivating subspace-based approaches for future unlearning systems.
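The core geometric idea can be illustrated with a minimal sketch. The paper does not publish its exact procedure here, so the following is a hypothetical toy implementation under stated assumptions: each language's fine-tuning on the forget set yields a flattened weight-delta "task vector", the shared interlingua subspace is estimated from the top singular vectors of the stacked deltas, and unlearning projects that shared component out. Function names, shapes, and the rank choice are illustrative, not the authors' API.

```python
import numpy as np

def shared_subspace(task_vectors, rank):
    """Estimate a shared ("interlingua") subspace as the top right
    singular vectors of the stacked per-language weight deltas."""
    M = np.stack(task_vectors)                    # (num_languages, num_params)
    _, _, vt = np.linalg.svd(M, full_matrices=False)
    return vt[:rank]                              # (rank, num_params), orthonormal rows

def project_out(vector, basis):
    """Remove the component of `vector` lying in the span of `basis`."""
    coeffs = basis @ vector                       # coordinates in the subspace
    return vector - basis.T @ coeffs              # orthogonal residual

# Toy data: seven "language" task vectors sharing one dominant direction,
# mimicking the shared interlingua structure described in the abstract.
rng = np.random.default_rng(0)
shared = rng.normal(size=64)
vectors = [shared + 0.1 * rng.normal(size=64) for _ in range(7)]

basis = shared_subspace(vectors, rank=1)
edited = project_out(vectors[0], basis)
# `edited` retains mostly the language-specific residual: the shared
# direction is largely removed, so an edit along `basis` would affect
# all languages, while `edited` captures what is specific to one.
```

The same projection run in reverse (keeping only the `basis` component) isolates the shared interlingua part, matching the paper's observation that removing the shared subspace harms all languages while language-specific components affect only one.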