LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Data contamination is now pervasive in large language model (LLM) training: evaluation benchmarks leak into training data, inflating scores and making benchmark results unreliable. To address this, the authors propose LNE-Blocking, a framework that restores a model's pre-contamination performance on potentially leaked datasets, without requiring contamination-free evaluation data. For each prompt, a contamination detector, LNE, first estimates how strongly the model has memorized the data; a disruption operation, Blocking, then perturbs greedy decoding with an intensity matched to that estimate, eliciting non-memorized responses. The framework is the first to efficiently restore greedy-decoding performance, and it achieves stable recovery across different models, multiple leakage-prone datasets, and varying contamination levels. Code is publicly available.

📝 Abstract
Data contamination is now almost inevitable in the development of large language models (LLMs): evaluation benchmarks are commonly absorbed into training data, even unintentionally, which makes it hard to benchmark LLMs fairly. Instead of constructing contamination-free datasets (which is quite hard), we propose a novel framework, **LNE-Blocking**, to restore a model's pre-contamination performance on potentially leaked datasets. Our framework consists of two components: contamination detection and a disruption operation. For each prompt, the framework first uses the contamination detection method, **LNE**, to assess the extent of contamination in the model. Based on this, it adjusts the intensity of the disruption operation, **Blocking**, to elicit non-memorized responses from the model. Our framework is the first to efficiently restore the model's greedy decoding performance. It shows strong performance on multiple datasets with potential leakage risks and consistently achieves stable recovery across different models and varying levels of data contamination. We release the code at https://github.com/RuijieH/LNE-Blocking to facilitate research.
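The two-step detect-then-disrupt loop can be sketched in toy form. Note this is a hypothetical illustration, not the authors' implementation: the dictionary "model", the memorization proxy standing in for LNE, and the top-candidate blocking rule are all assumptions made for the sketch.

```python
from typing import Dict, List

# Toy next-token "model": maps a context string to candidate-token scores.
# Stand-in for a real LLM's next-token distribution (illustrative only).
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "": {"the": 0.9, "a": 0.1},
    "the": {"answer": 0.8, "question": 0.2},
    "the answer": {"is": 0.7, "was": 0.3},
}

def contamination_score(model, leaked_text: List[str]) -> float:
    """Crude memorization proxy (a stand-in for the paper's LNE detector):
    the fraction of leaked tokens the model would reproduce greedily."""
    hits, ctx = 0, ""
    for tok in leaked_text:
        dist = model.get(ctx, {})
        if dist and max(dist, key=dist.get) == tok:
            hits += 1
        ctx = (ctx + " " + tok).strip()
    return hits / max(len(leaked_text), 1)

def blocked_greedy_decode(model, score: float, max_len: int = 3) -> List[str]:
    """Greedy decoding where the top candidates are blocked, with the
    blocking intensity scaled by the contamination score ('Blocking')."""
    k = round(score)  # in this toy, 0 or 1 blocked tokens per step
    out, ctx = [], ""
    for _ in range(max_len):
        dist = dict(model.get(ctx, {}))
        for _ in range(min(k, max(len(dist) - 1, 0))):
            dist.pop(max(dist, key=dist.get))  # drop the likely-memorized token
        if not dist:
            break
        tok = max(dist, key=dist.get)
        out.append(tok)
        ctx = (ctx + " " + tok).strip()
    return out

score = contamination_score(TOY_MODEL, ["the", "answer", "is"])
print(score)                                   # fully memorized toy sequence
print(blocked_greedy_decode(TOY_MODEL, 0.0))   # no contamination: plain greedy
print(blocked_greedy_decode(TOY_MODEL, score)) # contaminated: top token blocked
```

The key design point the sketch mirrors is that disruption strength is not fixed but conditioned on the per-prompt contamination estimate, so an uncontaminated model's greedy output is left untouched.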
Problem

Research questions and friction points this paper is trying to address.

Mitigates data contamination in large language models
Restores model performance on potentially leaked datasets
Detects contamination and disrupts memorized responses
Innovation

Methods, ideas, or system contributions that make the work stand out.

LNE-Blocking framework for contamination mitigation
Detects contamination and adjusts disruption intensity
Restores model performance with greedy decoding