Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge

📅 2025-02-04
🏛️ Findings of the Association for Computational Linguistics: ACL 2024
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited cross-lingual generalization of factual knowledge editing in large language models (LLMs). Building on prior editing methods that target specific multi-layer perceptron blocks, the authors examine how well such methods transfer across languages and investigate the role of Transformer attention mechanisms in cross-lingual knowledge editing. Drawing on these insights, they propose MEMAT (Mass-Editing Memory with Attention in Transformers), which incorporates attention into the editing process while requiring only minimal parameter modifications. MEMAT improves magnitude metrics by roughly 10%, achieves gains across all evaluation metrics, benefits languages not included in the training data, and demonstrates a high degree of portability across models.

📝 Abstract
Recent research has explored methods for updating and modifying factual knowledge in large language models, often focusing on specific multi-layer perceptron blocks. This study expands on this work by examining the effectiveness of existing knowledge editing methods across languages and delving into the role of attention mechanisms in this process. Drawing from the insights gained, we propose Mass-Editing Memory with Attention in Transformers (MEMAT), a method that achieves significant improvements in all metrics while requiring minimal parameter modifications. MEMAT delivers a remarkable 10% increase in magnitude metrics, benefits languages not included in the training data and also demonstrates a high degree of portability. Our code and data are at https://github.com/dtamayo-nlp/MEMAT.
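The abstract's reference to editing factual knowledge in "specific multi-layer perceptron blocks" follows the MEMIT line of work that MEMAT builds on. As a rough illustration (not the paper's implementation), the core idea can be sketched as a rank-one update that rewires an MLP projection so that a key vector representing the subject maps to a new value encoding the edited fact; all names and dimensions below are hypothetical:

```python
import numpy as np

def rank_one_edit(W, k, v_new):
    """Illustrative rank-one knowledge edit: update W so that W @ k == v_new.

    W     : (d_out, d_in) MLP down-projection weight (stand-in for a
            transformer MLP block targeted by editing methods)
    k     : (d_in,) key vector representing the subject's hidden state
    v_new : (d_out,) target value encoding the edited fact
    """
    v_old = W @ k
    # Correct the output only along the key direction, leaving
    # orthogonal directions of the weight matrix untouched.
    delta = np.outer(v_new - v_old, k) / (k @ k)
    return W + delta

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
k = rng.normal(size=4)
v_new = rng.normal(size=8)

W_edited = rank_one_edit(W, k, v_new)
# The edited weight now maps the key to the new value exactly,
# while inputs orthogonal to k are unaffected.
```

MEMAT's contribution, per the abstract, is to move beyond MLP-only edits by also exploiting attention mechanisms; this sketch only conveys the baseline editing operation the paper starts from.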
Problem

Research questions and friction points this paper is trying to address.

Cross-lingual knowledge editing
Attention mechanisms in transformers
Minimal parameter modifications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mass-Editing Memory with Attention
Cross-lingual knowledge editing
Minimal parameter modifications