Towards a Real-World Aligned Benchmark for Unlearning in Recommender Systems

📅 2025-08-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing machine unlearning benchmarks for recommender systems (e.g., CURE4Rec) overemphasize collaborative filtering while neglecting realistic tasks such as session-based and next-basket recommendation; they further assume unrealistic large-batch unlearning requests, ignoring temporal dependencies and strict latency constraints. Method: We introduce the first real-world-oriented unlearning benchmark for recommendation, covering diverse task types and domain-specific unlearning scenarios—emphasizing fine-grained, temporally ordered deletion requests and low-latency model updates. Leveraging both collaborative filtering and sequential modeling paradigms, we design lightweight, task-customized unlearning algorithms inspired by the NeurIPS Machine Unlearning Competition framework. Contribution/Results: Evaluated on next-basket recommendation, our algorithms achieve higher unlearning accuracy than generic baselines while maintaining sub-second to few-second per-request update latency—demonstrating, for the first time, efficient and practical machine unlearning in sequential recommendation models.

📝 Abstract
Modern recommender systems heavily leverage user interaction data to deliver personalized experiences. However, relying on personal data presents challenges in adhering to privacy regulations, such as the GDPR's "right to be forgotten". Machine unlearning (MU) aims to address these challenges by enabling the efficient removal of specific training data from models post-training, without compromising model utility or leaving residual information. However, current benchmarks for unlearning in recommender systems -- most notably CURE4Rec -- fail to reflect real-world operational demands. They focus narrowly on collaborative filtering, overlook tasks like session-based and next-basket recommendation, simulate unrealistically large unlearning requests, and ignore critical efficiency constraints. In this paper, we propose a set of design desiderata and research questions to guide the development of a more realistic benchmark for unlearning in recommender systems, with the goal of gathering feedback from the research community. Our benchmark proposal spans multiple recommendation tasks, includes domain-specific unlearning scenarios, and covers several unlearning algorithms -- including ones adapted from a recent NeurIPS unlearning competition. Furthermore, we argue for an unlearning setup that reflects the sequential, time-sensitive nature of real-world deletion requests. We also present a preliminary experiment in a next-basket recommendation setting based on our proposed desiderata and find that unlearning also works for sequential recommendation models exposed to many small unlearning requests. In this setting, a modification of an unlearning algorithm custom-designed for recommender systems significantly outperforms general-purpose unlearning algorithms, and unlearning can be executed with a latency of only several seconds.
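The sequential, time-sensitive deletion setup the abstract argues for can be sketched in a few lines. This is a minimal illustration, not the paper's benchmark code: the model state, the `unlearn_request` update, and the toy data are all hypothetical stand-ins; the point is that requests are processed one at a time in timestamp order, with per-request latency recorded.

```python
import time

# Hypothetical sketch of a sequential unlearning protocol: fine-grained
# deletion requests arrive in temporal order, each triggering a low-latency
# model update. `unlearn_request` is a placeholder, not a real algorithm.

def unlearn_request(model, request):
    """Placeholder update: drop the requesting user's interactions."""
    model["interactions"] = [
        (u, i, t) for (u, i, t) in model["interactions"]
        if u != request["user"]
    ]
    return model

def run_sequential_benchmark(model, requests):
    """Process temporally ordered requests; record per-request latency."""
    latencies = []
    for request in sorted(requests, key=lambda r: r["timestamp"]):
        start = time.perf_counter()
        model = unlearn_request(model, request)
        latencies.append(time.perf_counter() - start)
    return model, latencies

# Toy data: (user, item, timestamp) interaction triples for 5 users x 3 items.
model = {"interactions": [
    (u, i, t) for t, (u, i) in
    enumerate((u, i) for u in range(5) for i in range(3))
]}
requests = [{"user": u, "timestamp": u} for u in (1, 3)]
model, latencies = run_sequential_benchmark(model, requests)
```

In a real benchmark run, the latency list would be summarized (e.g., mean and tail latency per request) and compared against a strict per-request budget, mirroring the sub-second to few-second updates reported in the summary above.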
Problem

Research questions and friction points this paper is trying to address.

Developing a realistic benchmark for evaluating unlearning in recommender systems
Addressing the limitations of current benchmarks under real-world operational demands
Ensuring efficient data removal while maintaining model utility
Innovation

Methods, ideas, or system contributions that make the work stand out.

Real-world-oriented benchmark for recommender unlearning
Coverage of multiple recommendation tasks and domain-specific unlearning scenarios
Efficient unlearning algorithms with per-request latencies of only seconds