Towards Evaluation for Real-World LLM Unlearning

📅 2025-08-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM unlearning evaluation metrics suffer from limited practicality, accuracy, and robustness. To address this, we propose DCUE, a distribution-calibration-based unlearning evaluation framework. DCUE identifies semantically critical tokens in model outputs, corrects confidence distribution biases via unsupervised calibration on a validation set—without requiring human annotations or strong modeling assumptions—and quantifies pre- and post-unlearning distributional shifts using the Kolmogorov–Smirnov test. Our method explicitly accounts for semantic token importance, unlike conventional metrics such as accuracy drop or KL divergence, which are noise-sensitive and agnostic to semantics. Experiments demonstrate that DCUE significantly improves evaluation sensitivity and reliability, effectively distinguishing the efficacy of diverse unlearning algorithms. By providing an interpretable, reproducible, and assumption-light benchmark, DCUE advances trustworthy LLM unlearning assessment.

📝 Abstract
This paper analyzes the limitations of existing unlearning evaluation metrics in terms of practicality, exactness, and robustness in real-world LLM unlearning scenarios. To overcome these limitations, we propose a new metric called Distribution Correction-based Unlearning Evaluation (DCUE). It identifies core tokens and corrects distributional biases in their confidence scores using a validation set, then quantifies the resulting distributional shift with the Kolmogorov–Smirnov test. Experimental results demonstrate that DCUE overcomes the limitations of existing metrics and offers guidance for designing more practical and reliable unlearning algorithms in the future.
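The final step of the pipeline described above, quantifying the pre- versus post-unlearning shift in token confidence distributions, can be sketched with a two-sample Kolmogorov–Smirnov statistic. This is an illustrative sketch, not the paper's implementation: the synthetic `conf_before`/`conf_after` arrays stand in for calibrated confidences of critical tokens, which DCUE would first select and calibrate on a validation set.

```python
import numpy as np

def ks_statistic(a: np.ndarray, b: np.ndarray) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the two empirical CDFs, in [0, 1]."""
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(0)
# Hypothetical calibrated confidences of critical tokens (synthetic data):
conf_before = rng.beta(8, 2, size=500)  # high confidence before unlearning
conf_after = rng.beta(3, 5, size=500)   # suppressed confidence afterwards

print(f"KS statistic: {ks_statistic(conf_before, conf_after):.3f}")
```

A larger statistic indicates a stronger distributional shift, i.e. a more pronounced unlearning effect on the selected tokens; in practice one would also consult the associated p-value (e.g. via `scipy.stats.ks_2samp`).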
Problem

Research questions and friction points this paper is trying to address.

Limitations of existing unlearning evaluation metrics
Proposes Distribution Correction-based Unlearning Evaluation (DCUE)
Improves practicality and reliability of unlearning algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes DCUE for unlearning evaluation
Corrects biases using validation set
Quantifies results with Kolmogorov-Smirnov test
Ke Miao
The State Key Laboratory of Blockchain and Data Security, Zhejiang University
Yuke Hu
Zhejiang University
Data Privacy · Trustworthy LLM · Differential Privacy · Machine Unlearning
Xiaochen Li
UNC Greensboro
Wenjie Bao
The State Key Laboratory of Blockchain and Data Security, Zhejiang University
Zhihao Liu
The State Key Laboratory of Blockchain and Data Security, Zhejiang University
Zhan Qin
Researcher, Zhejiang University
Data Security and Privacy · AI Security
Kui Ren
Professor and Dean of Computer Science, Zhejiang University, ACM/IEEE Fellow
Data Security & Privacy · AI Security · IoT & Vehicular Security