AI Summary
To address the distribution mismatch between synthetically generated and real user behavioral data, which is caused by random sampling in model extraction attacks (MEAs) and undermines attack efficacy, this paper proposes, for the first time, an LLM-based human-like data generation paradigm. It leverages large language models to simulate user preferences and iteratively generates high-quality, semantically coherent, and temporally consistent training sequences via black-box interaction with the target sequential recommender system. This approach significantly narrows the gap between synthetic and real-world data distributions, improving attack success rate by 44.82% on average and reducing the Wasserstein distance by up to 64.98%. The study further identifies recommender-system hyperparameters on which attack robustness critically depends and designs targeted defense strategies accordingly.
Abstract
Recent studies have demonstrated the vulnerability of sequential recommender systems to Model Extraction Attacks (MEAs). MEAs collect responses from a recommender system to replicate its functionality, enabling unauthorized deployments and posing critical privacy and security risks. Black-box attacks in prior MEAs are ineffective at exposing recommender system vulnerabilities due to random sampling in data selection, which leads to misaligned synthetic and real-world distributions. To overcome this limitation, we propose LLM4MEA, a novel model extraction method that leverages Large Language Models (LLMs) as human-like rankers to generate data. It generates data through interactions between the LLM ranker and the target recommender system. In each interaction, the LLM ranker analyzes the historical interactions to infer the user's preferences, then selects items from the recommendations that are consistent with those preferences to extend the interaction history, which serves as training data for the MEA. Extensive experiments demonstrate that LLM4MEA significantly outperforms existing approaches in data quality and attack performance, reducing the divergence between synthetic and real-world data by up to 64.98% and improving MEA performance by 44.82% on average. From a defensive perspective, we propose a simple yet effective defense strategy and identify key hyperparameters of recommender systems that can mitigate the risk of MEAs.
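The interaction loop described in the abstract (query the black-box recommender, let an LLM ranker pick a preference-consistent item, extend the history, repeat) can be sketched as below. This is a minimal, self-contained illustration, not the paper's implementation: `target_recommender` and `llm_ranker` are hypothetical stand-ins, with a seeded random top-k list in place of a real recommender and a numeric "preference profile" in place of an actual LLM's semantic judgment.

```python
import random

def target_recommender(history, k=5):
    # Hypothetical black-box target: returns a top-k list of item ids.
    # Seeded pseudo-random stand-in so the sketch is runnable.
    random.seed(sum(history))
    return random.sample(range(100), k)

def llm_ranker(history, candidates):
    # Hypothetical stand-in for the LLM ranker: choose the candidate
    # closest to the inferred preference. Here the running mean of item
    # ids stands in for a semantic preference profile built from history.
    pref = sum(history) / len(history)
    return min(candidates, key=lambda c: abs(c - pref))

def generate_sequence(seed_item, length=10):
    # One black-box interaction episode: repeatedly query the target
    # recommender and let the ranker extend the history with a
    # preference-consistent item.
    history = [seed_item]
    for _ in range(length - 1):
        recs = target_recommender(history)
        history.append(llm_ranker(history, recs))
    return history

# Synthetic training sequences for the surrogate (extracted) model.
dataset = [generate_sequence(seed) for seed in range(3)]
```

The resulting sequences play the role of the human-like training data: because each new item is chosen for consistency with the history rather than sampled uniformly, the synthetic distribution tracks plausible user behavior more closely than random sampling would.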