Condensing Pre-Augmented Recommendation Data via Lightweight Policy Gradient Estimation

📅 2023-10-02
🏛️ IEEE Transactions on Knowledge and Data Engineering
📈 Citations: 3
Influential: 0
🤖 AI Summary
Recommender systems face a critical bottleneck in training efficiency due to large-scale user–item interaction data. Existing data compression methods struggle to preserve the discrete interaction structure and capture users’ latent preferences at the same time. To address this, we propose DConRec, a lightweight data condensation framework tailored for recommendation. DConRec introduces a pre-augmented condensation paradigm: it models discrete interactions probabilistically, incorporates a pre-augmentation module to enhance synthetic data quality, and employs a lightweight policy gradient estimator to accelerate the optimization of data synthesis. We provide theoretical convergence guarantees for the algorithm. Extensive experiments on multiple real-world datasets demonstrate that DConRec achieves over 98% of the original model performance using only 0.5% of the raw interaction data, while accelerating training by 3.2×, which significantly reduces the computational overhead of recommender model training.
📝 Abstract
Training recommendation models on large datasets requires significant time and resources. It is therefore desirable to construct concise yet informative datasets for efficient training. Recent advances in dataset condensation show promise in addressing this problem by synthesizing small datasets. However, applying existing dataset condensation methods to recommendation has two limitations: (1) they fail to generate discrete user-item interactions, and (2) they cannot preserve users’ potential preferences. To address these limitations, we propose a lightweight condensation framework tailored for recommendation (DConRec), focusing on condensing user-item historical interaction sets. Specifically, we model the discrete user-item interactions via a probabilistic approach and design a pre-augmentation module to incorporate the potential preferences of users into the condensed datasets. Because the substantial size of the datasets makes optimization costly, we propose a lightweight policy gradient estimation to accelerate the data synthesis. Experimental results on multiple real-world datasets demonstrate the effectiveness and efficiency of our framework. In addition, we provide a theoretical analysis of the provable convergence of DConRec.
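To make the combination of "probabilistic modeling of discrete interactions" and "policy gradient estimation" more concrete, the block below writes out one generic formulation. It is an illustrative assumption, not the paper's exact equations, which the abstract does not state: each synthetic interaction X_{ui} is treated as an independent Bernoulli variable with logit θ_{ui}, and 𝓛 stands for whatever training objective is evaluated on the sampled data. The second line is the standard score-function (REINFORCE) identity, and the third is the derivative of the Bernoulli log-likelihood under this logit parameterization.

```latex
\begin{aligned}
X_{ui} &\sim \mathrm{Bernoulli}\big(\sigma(\theta_{ui})\big), \\
\nabla_{\theta}\, \mathbb{E}_{X \sim p_{\theta}}\big[\mathcal{L}(X)\big]
  &= \mathbb{E}_{X \sim p_{\theta}}\big[\mathcal{L}(X)\, \nabla_{\theta} \log p_{\theta}(X)\big], \\
\frac{\partial \log p_{\theta}(X)}{\partial \theta_{ui}}
  &= X_{ui} - \sigma(\theta_{ui}).
\end{aligned}
```

Under these assumptions the gradient can be estimated from sampled interaction matrices alone, without differentiating through the discrete sampling step.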
Problem

Research questions and friction points this paper is trying to address.

Condensing large recommendation datasets for efficient training
Generating discrete user-item interactions in condensed datasets
Preserving users' potential preferences in condensed datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic modeling of user-item interactions
Pre-augmentation for preserving user preferences
Lightweight policy gradient for efficient synthesis
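The ideas listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the DConRec implementation: interactions are modeled as independent Bernoulli variables, the gradient is a generic score-function (REINFORCE) estimate with a leave-one-out baseline, and the loss is a toy squared-error objective standing in for a recommendation training loss. The names `sample_interactions`, `score_function_gradient`, and `fit_interaction_probs` are hypothetical, and the pre-augmentation module is not modeled.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample_interactions(logits, rng):
    """Draw a binary user-item matrix; entry (u, i) is 1 with probability sigmoid(logits[u, i])."""
    probs = sigmoid(logits)
    return (rng.random(logits.shape) < probs).astype(np.float64)


def score_function_gradient(logits, loss_fn, rng, num_samples=64):
    """Estimate d E[loss(X)] / d logits from samples only (no gradient through sampling)."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2 for the leave-one-out baseline")
    probs = sigmoid(logits)
    samples = np.stack([sample_interactions(logits, rng) for _ in range(num_samples)])
    losses = np.array([loss_fn(x) for x in samples], dtype=np.float64)
    # Leave-one-out baseline: each sample is compared with the mean loss of the others.
    baselines = (losses.sum() - losses) / (num_samples - 1)
    advantages = losses - baselines
    # For a Bernoulli with p = sigmoid(logit), d log p(x) / d logit = x - p.
    scores = samples - probs
    return np.tensordot(advantages, scores, axes=1) / num_samples


def fit_interaction_probs(target, steps=50, lr=1.0, seed=0):
    """Toy loop: push interaction probabilities toward a target matrix (assumed objective)."""
    rng = np.random.default_rng(seed)
    logits = np.zeros_like(target, dtype=np.float64)

    def loss_fn(x):
        return float(np.sum((x - target) ** 2))

    for _ in range(steps):
        logits -= lr * score_function_gradient(logits, loss_fn, rng)
    return sigmoid(logits)


if __name__ == "__main__":
    target = np.array([[1, 0, 1], [0, 1, 0]], dtype=np.float64)
    print(np.round(fit_interaction_probs(target), 2))
```

Because the estimator only needs loss values on sampled matrices, the discrete interactions never have to be relaxed into continuous values; the baseline is there purely to reduce variance.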
Jiahao Wu
The Chinese University of Hong Kong
Medical Robots, Robot-assisted Microsurgery, Motion Planning

Wenqi Fan

Jingfan Chen
The Hong Kong Polytechnic University
Agent, Large Language Model, Graph Neural Networks, Recommender Systems

Shengcai Liu
Southern University of Science and Technology
Learn to Optimize, LLM+Optimization

Qijiong Liu

Rui He
Department of Computer Science and Technology, Tongji University, Shanghai, China

Qing Li

Ke Tang
Department of Computer Science and Technology, Tongji University, Shanghai, China