LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Data contamination is now pervasive in large language model (LLM) training: evaluation benchmarks leak into training data, inflating scores and making benchmark results unreliable. To address this, the authors propose LNE-Blocking, a framework that restores a model's pre-contamination performance on potentially leaked datasets, without requiring contamination-free evaluation data. For each prompt, a contamination detector, LNE, first estimates how strongly the model has memorized the data; a disruption operation, Blocking, then perturbs greedy decoding with an intensity matched to that estimate, eliciting non-memorized responses. The framework is the first to efficiently restore greedy-decoding performance, and it achieves stable recovery across different models, multiple leakage-prone datasets, and varying contamination levels. Code is publicly available.

📝 Abstract
Data contamination is now almost inevitable in the development of large language models (LLMs): evaluation benchmarks are commonly absorbed into training data, even unintentionally, which makes it hard to benchmark LLMs fairly. Instead of constructing contamination-free datasets (which is quite hard), we propose a novel framework, **LNE-Blocking**, to restore a model's pre-contamination performance on potentially leaked datasets. Our framework consists of two components: contamination detection and a disruption operation. For each prompt, the framework first uses the contamination detection method, **LNE**, to assess the extent of contamination in the model. Based on this, it adjusts the intensity of the disruption operation, **Blocking**, to elicit non-memorized responses from the model. Our framework is the first to efficiently restore the model's greedy decoding performance. It shows strong performance on multiple datasets with potential leakage risks and consistently achieves stable recovery across different models and varying levels of data contamination. We release the code at https://github.com/RuijieH/LNE-Blocking to facilitate research.
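The two-step detect-then-disrupt loop can be sketched in toy form. Note this is a hypothetical illustration, not the authors' implementation: the dictionary "model", the memorization proxy standing in for LNE, and the top-candidate blocking rule are all assumptions made for the sketch.

```python
from typing import Dict, List

# Toy next-token "model": maps a context string to candidate-token scores.
# Stand-in for a real LLM's next-token distribution (illustrative only).
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "": {"the": 0.9, "a": 0.1},
    "the": {"answer": 0.8, "question": 0.2},
    "the answer": {"is": 0.7, "was": 0.3},
}

def contamination_score(model, leaked_text: List[str]) -> float:
    """Crude memorization proxy (a stand-in for the paper's LNE detector):
    the fraction of leaked tokens the model would reproduce greedily."""
    hits, ctx = 0, ""
    for tok in leaked_text:
        dist = model.get(ctx, {})
        if dist and max(dist, key=dist.get) == tok:
            hits += 1
        ctx = (ctx + " " + tok).strip()
    return hits / max(len(leaked_text), 1)

def blocked_greedy_decode(model, score: float, max_len: int = 3) -> List[str]:
    """Greedy decoding where the top candidates are blocked, with the
    blocking intensity scaled by the contamination score ('Blocking')."""
    k = round(score)  # in this toy, 0 or 1 blocked tokens per step
    out, ctx = [], ""
    for _ in range(max_len):
        dist = dict(model.get(ctx, {}))
        for _ in range(min(k, max(len(dist) - 1, 0))):
            dist.pop(max(dist, key=dist.get))  # drop the likely-memorized token
        if not dist:
            break
        tok = max(dist, key=dist.get)
        out.append(tok)
        ctx = (ctx + " " + tok).strip()
    return out

score = contamination_score(TOY_MODEL, ["the", "answer", "is"])
print(score)                                   # fully memorized toy sequence
print(blocked_greedy_decode(TOY_MODEL, 0.0))   # no contamination: plain greedy
print(blocked_greedy_decode(TOY_MODEL, score)) # contaminated: top token blocked
```

The key design point the sketch mirrors is that disruption strength is not fixed but conditioned on the per-prompt contamination estimate, so an uncontaminated model's greedy output is left untouched.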
Problem

Research questions and friction points this paper is trying to address.

Mitigates data contamination in large language models
Restores model performance on potentially leaked datasets
Detects contamination and disrupts memorized responses
Innovation

Methods, ideas, or system contributions that make the work stand out.

LNE-Blocking framework for contamination mitigation
Detects contamination and adjusts disruption intensity
Restores model performance with greedy decoding