Observation-Free Attacks on Online Learning to Rank

📅 2025-09-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work exposes a fundamental vulnerability of online learning to rank (OLTR) algorithms under collaborative adversarial attacks: an attacker can stealthily manipulate ranking outcomes, keeping a target item persistently in the top-K positions, without observing any user feedback. To this end, the authors propose the first "observation-free" attack paradigm, which needs only O(log T) manipulations to induce linear regret and sustain the target's exposure for T − o(T) rounds. Building on the cascade and position-based click models, they design two tailored attack strategies, CascadeOFA and PBMOFA, targeting CascadeUCB1 and PBM-UCB respectively. They provide rigorous theoretical guarantees of the strategies' efficacy and validate them empirically on real-world datasets, demonstrating that minimal manipulations drastically degrade recommendation quality. This is the first systematic study of the security risks OLTR faces in black-box, feedback-free settings, offering both a foundational warning and a benchmark for designing robust ranking algorithms.

📝 Abstract
Online learning to rank (OLTR) plays a critical role in information retrieval and machine learning systems, with a wide range of applications in search engines and content recommenders. However, despite their extensive adoption, the susceptibility of OLTR algorithms to coordinated adversarial attacks remains poorly understood. In this work, we present a novel framework for attacking some of the widely used OLTR algorithms. Our framework is designed to promote a set of target items so that they appear in the list of top-K recommendations for T - o(T) rounds, while simultaneously inducing linear regret in the learning algorithm. We propose two novel attack strategies: CascadeOFA for CascadeUCB1 and PBMOFA for PBM-UCB. We provide theoretical guarantees showing that both strategies require only O(log T) manipulations to succeed. Additionally, we supplement our theoretical analysis with empirical results on real-world data.
Problem

Research questions and friction points this paper is trying to address.

Attacking online learning to rank algorithms without observation requirements
Promoting target items in top-K recommendations with minimal manipulations
Inducing linear regret in OLTR systems using logarithmic attack cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

Observation-free attack framework for OLTR algorithms
Two strategies requiring only logarithmic manipulations
Promotion of target items while inducing linear regret in the learner
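To make the threat model concrete, here is a minimal sketch of the setting the paper attacks: a cascade click model with a CascadeUCB1-style learner (Kveton et al., 2015), which ranks the K items with the highest UCB indices while the user scans top-down and clicks the first attractive item. The attraction probabilities, constants, and function name below are illustrative assumptions, not taken from the paper; the sketch only marks the interception point where an observation-free attacker would overwrite the click signal without reading it. The paper's actual CascadeOFA/PBMOFA schedules, which achieve this with only O(log T) manipulations, are not reproduced here.

```python
import math
import random

def simulate(T=20000, L=6, K=2, seed=1):
    """Toy cascade-model bandit loop with a CascadeUCB1-style learner.
    Attraction probabilities are made up for illustration."""
    rng = random.Random(seed)
    attraction = [0.8, 0.7, 0.2, 0.2, 0.1, 0.1]  # hypothetical; items 0,1 are best
    n = [0] * L      # times each item was examined by the user
    w = [0] * L      # times each item was clicked
    shown = [0] * L  # times each item appeared in the top-K list
    for t in range(1, T + 1):
        def ucb(i):
            if n[i] == 0:
                return float("inf")  # force one examination of every item
            return w[i] / n[i] + math.sqrt(1.5 * math.log(t) / n[i])
        ranked = sorted(range(L), key=ucb, reverse=True)[:K]
        for i in ranked:
            shown[i] += 1
        # cascade user: scan top-down, click the first attractive item, stop
        clicked = None
        for pos, i in enumerate(ranked):
            if rng.random() < attraction[i]:
                clicked = pos
                break
        # >>> an observation-free attacker would overwrite `clicked` here,
        # >>> without ever reading the user's real feedback
        last = K - 1 if clicked is None else clicked
        for pos in range(last + 1):  # only scanned positions are observed
            i = ranked[pos]
            n[i] += 1
            w[i] += 1 if pos == clicked else 0
    return shown
```

Absent an attacker, the learner concentrates its top-K slots on the genuinely attractive items; the attack strategies exploit exactly this feedback loop, since forging a click at the target's position both boosts the target and leaves every lower-ranked item unobserved under the cascade model.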