Prompt-Unknown Promotion Attacks against LLM-based Sequential Recommender Systems

📅 2026-04-26

🤖 AI Summary

This work studies item promotion in large language model (LLM)-driven sequential recommender systems under a fully black-box setting, where the attacker knows neither the victim model nor its system prompt. The authors propose PUDA, a Prompt-Unknown Dual-poisoning Attack framework. PUDA uses an LLM-based evolutionary refinement strategy to infer the discrete system prompt, then trains a surrogate model that mimics the victim's behavior. With the inferred prompt and the surrogate in hand, it attacks on two fronts: it adversarially rewrites the target item's text under semantic constraints, and it injects highly plausible, surrogate-generated poisoning sequences. In experiments on real-world datasets, PUDA consistently outperforms state-of-the-art competitors at boosting the exposure of unpopular target items. The results indicate that LLM-based recommender systems remain vulnerable even when both the prompt and the model are kept private.

📝 Abstract

Large language model-powered sequential recommender systems (LLM-SRSs) have recently demonstrated remarkable performance, enabling recommendations through prompt-driven inference over user interaction sequences. However, this paradigm also introduces new security vulnerabilities, particularly text-level manipulations, rendering them appealing targets for promotion attacks that purposely boost the ranking of specific target items. Although such security risks have been receiving increasing attention, existing studies typically rely on an unrealistic assumption of access to either the victim model or prompt to unveil attack mechanisms. In this work, we investigate the item promotion attack in LLM-SRSs under a more realistic setting where both the system prompt and victim model are unknown to the attacker, and propose a Prompt-Unknown Dual-poisoning Attack (PUDA) framework. To simulate attacks under this full black-box setting, we introduce an LLM-based evolutionary refinement strategy that infers discrete system prompts, enabling the training of an effective surrogate model that mimics the behaviors of the victim model. Leveraging the distilled prompt and surrogate model, we devise a promotion attack that adversarially revises target item texts under semantic constraints, which is further complemented by the highly plausible, surrogate-generated poisoning sequences to enable cost-effective target item promotion. Extensive experiments on real-world datasets demonstrate that PUDA consistently outperforms state-of-the-art competitors in boosting the exposure of unpopular target items. Our findings reveal critical security risks in modern LLM-SRSs even when both prompts and models are protected, and highlight the need for more robust defensive means.
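
The prompt-inference step can be pictured as a search over candidate prompts, scored by how closely a stand-in recommender reproduces the victim's observed outputs. The following is only a toy sketch of that idea, not the PUDA implementation: the instruction phrases, the feature-sorting `rank` function, the random `mutate` operator (the abstract describes LLM-based refinement instead), and all parameter values are assumptions made for illustration.

```python
import random

# Hypothetical instruction phrases and the item feature each one refers to.
FEATURE = {
    "prefer recent items": "recency",
    "prefer popular items": "popularity",
    "prefer same genre": "genre_match",
}
PHRASES = list(FEATURE)

def rank(prompt, items):
    """Toy recommender: sort items by the features the prompt names, in order."""
    keys = [FEATURE[p] for p in prompt]
    ordered = sorted(items, key=lambda it: tuple(-it[k] for k in keys))
    return [it["id"] for it in ordered]

def make_probes(rng, n_probes=20, n_items=5):
    """Random candidate sets the attacker submits to the black-box victim."""
    return [
        [{"id": i, "recency": rng.randint(0, 3), "popularity": rng.randint(0, 3),
          "genre_match": rng.randint(0, 1)} for i in range(n_items)]
        for _ in range(n_probes)
    ]

def fitness(prompt, probes, victim_outputs):
    """Mean fraction of ranking positions where the candidate matches the victim."""
    total = 0.0
    for items, target in zip(probes, victim_outputs):
        guess = rank(prompt, items)
        total += sum(g == t for g, t in zip(guess, target)) / len(target)
    return total / len(probes)

def mutate(prompt, rng):
    """Random add, drop, or swap edit (assumed stand-in for LLM-based refinement)."""
    p = list(prompt)
    unused = [x for x in PHRASES if x not in p]
    op = rng.choice(["add", "drop", "swap"])
    if op == "add" and unused:
        p.insert(rng.randrange(len(p) + 1), rng.choice(unused))
    elif op == "drop" and p:
        p.pop(rng.randrange(len(p)))
    elif op == "swap" and len(p) > 1:
        i, j = rng.sample(range(len(p)), 2)
        p[i], p[j] = p[j], p[i]
    return tuple(p)

def infer_prompt(probes, victim_outputs, rng, pop_size=8, generations=30):
    """Elitist evolutionary loop: keep the best half, refill with mutated elites."""
    population = [()] + [mutate((), rng) for _ in range(pop_size - 1)]
    score = lambda p: fitness(p, probes, victim_outputs)
    for _ in range(generations):
        elites = sorted(population, key=score, reverse=True)[: pop_size // 2]
        population = elites + [mutate(rng.choice(elites), rng) for _ in elites]
    return max(population, key=score)

# The hidden prompt is only used to simulate the victim's observable outputs.
HIDDEN_PROMPT = ("prefer same genre", "prefer recent items")
rng = random.Random(0)
probes = make_probes(rng)
victim_outputs = [rank(HIDDEN_PROMPT, items) for items in probes]
best = infer_prompt(probes, victim_outputs, rng)
print(best, fitness(best, probes, victim_outputs))
```

Because the loop keeps the best candidates in every generation, the agreement score of the returned prompt never falls below that of the empty starting prompt. In a realistic setting the high-agreement prompt would then be used to collect training data for a surrogate model; that step is omitted here.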

Problem

Research questions and friction points this paper is trying to address.

LLM-based sequential recommender systems

promotion attacks

prompt-unknown

black-box attack

security vulnerabilities

Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt inference

black-box attack

surrogate model