🤖 AI Summary
This work addresses privacy risks arising from training-data memorization in large language models (LLMs). We propose the first verifiable memorization detection method that scales to full datasets and operates without sample-level prompt engineering. Methodologically, we train a memory-inducing LLM and couple it with a statistical hypothesis-testing framework to enable prompt-free, dataset-level detection; the method can also be integrated as a plug-in to accelerate existing sample-level approaches. Evaluated on Pythia and Llama models, our approach identifies up to 40% more memorized training instances than prior methods under constrained time budgets and reduces search time by up to 80% when used as a plug-in, substantially improving the feasibility and trustworthiness of large-scale model privacy auditing. Our core contribution is the first dataset-level, automated, verifiable, and computationally efficient memorization detection paradigm, establishing a new standard for rigorous, scalable privacy assessment of LLMs.
📝 Abstract
Large language models (LLMs) have been shown to memorize and reproduce content from their training data, raising significant privacy concerns, especially with web-scale datasets. Existing detection methods are primarily sample-specific: they rely on manually crafted or discretely optimized memory-inducing prompts generated per sample, which makes dataset-level detection impractical due to the prohibitive computational cost of iterating over all samples. In real-world scenarios, a data owner may need to verify whether a suspect LLM has memorized their dataset, particularly when the LLM may have collected the data from the web without authorization. To address this, we introduce MemHunter, which trains a memory-inducing LLM and employs hypothesis testing to efficiently detect memorization at the dataset level, without requiring sample-specific memory-inducing prompts. Experiments on models such as Pythia and Llama demonstrate that MemHunter can extract up to 40% more training data than existing methods under constrained time resources and can reduce search time by up to 80% when integrated as a plug-in. Crucially, MemHunter is the first method capable of dataset-level memorization detection, providing a critical tool for assessing privacy risks in LLMs trained on large-scale datasets.
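The abstract does not spell out MemHunter's statistical procedure, so the following is only a minimal sketch of what dataset-level detection via hypothesis testing could look like. It assumes per-sample memorization scores (similarity between the model's elicited continuation and the true suffix) are already available, and runs a one-sided binomial test of whether the match rate on the owner's dataset exceeds a chance rate calibrated on non-member text. The function name, score metric, thresholds, and null rate are all illustrative assumptions, not the paper's actual API.

```python
# Minimal sketch of dataset-level memorization detection via hypothesis
# testing (illustrative only; not MemHunter's actual implementation).
from scipy.stats import binomtest

def dataset_is_memorized(scores, match_threshold=0.8,
                         null_match_rate=0.05, alpha=0.01):
    """Test H0: the model reproduces samples no more often than chance.

    scores          -- per-sample similarity in [0, 1] between the model's
                       continuation (elicited by memory-inducing prompts)
                       and the true suffix; hypothetical metric.
    null_match_rate -- chance rate of high-similarity matches, calibrated
                       on held-out non-member text (assumed available).
    Returns (reject_H0, p_value); reject_H0=True flags the dataset
    as memorized at significance level alpha.
    """
    hits = sum(s >= match_threshold for s in scores)
    result = binomtest(hits, n=len(scores), p=null_match_rate,
                       alternative="greater")
    return result.pvalue < alpha, result.pvalue

# Example: 18 of 100 samples reproduced near-verbatim vs. a 5% chance rate.
memorized, p = dataset_is_memorized([1.0] * 18 + [0.0] * 82)
print(f"memorized={memorized}, p={p:.2e}")
```

Testing the batch as a whole, rather than certifying each sample individually, is what lets a dataset-level decision come from a single aggregate statistic instead of per-sample prompt searches.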