With a Little Help From My Friends: Collective Manipulation in Risk-Controlling Recommender Systems

πŸ“… 2026-03-30
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This study examines the vulnerability of risk-controlling recommender systems to coordinated adversarial manipulation, demonstrating that a coordinated group comprising as little as 1% of users can degrade nDCG for non-adversarial users by up to 20%. The paper traces the root cause to existing mechanisms’ reliance on aggregated user feedback, which obscures individual-level anomalies. To mitigate this, the authors propose shifting risk control from the group level to the user level. Combining a conformal risk control framework with simulated coordinated attacks and fine-grained analysis of user behaviour, the proposed approach improves robustness while preserving recommendation personalization. Experimental results show that the method effectively mitigates coordinated manipulation attacks without compromising utility for legitimate users.
πŸ“ Abstract
Recommendation systems have become central gatekeepers of online information, shaping user behaviour across a wide range of activities. In response, users increasingly organize and coordinate to steer algorithmic outcomes toward diverse goals, such as promoting relevant content or limiting harmful material, relying on platform affordances -- such as likes, reviews, or ratings. While these mechanisms can serve beneficial purposes, they can also be leveraged for adversarial manipulation, particularly in systems where such feedback directly informs safety guarantees. In this paper, we study this vulnerability in recently proposed risk-controlling recommender systems, which use binary user feedback (e.g., "Not Interested") to provably limit exposure to unwanted content via conformal risk control. We empirically demonstrate that their reliance on aggregate feedback signals makes them inherently susceptible to coordinated adversarial user behaviour. Using data from a large-scale online video-sharing platform, we show that a small coordinated group (comprising only 1% of the user population) can induce up to a 20% degradation in nDCG for non-adversarial users by exploiting the affordances provided by risk-controlling recommender systems. We evaluate simple, realistic attack strategies that require little to no knowledge of the underlying recommendation algorithm and find that, while coordinated users can significantly harm overall recommendation quality, they cannot selectively suppress specific content groups through reporting alone. Finally, we propose a mitigation strategy that shifts guarantees from the group level to the user level, showing empirically how it can reduce the impact of adversarial coordinated behaviour while ensuring personalized safety for individuals.
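To make the mechanism under attack concrete, here is a minimal sketch of conformal risk control applied to recommendation filtering: items are shown only when their predicted score exceeds a threshold, and the threshold is calibrated on held-out binary "Not Interested" feedback so that the (finite-sample corrected) expected fraction of unwanted items shown stays below a target level α. All names, the score/feedback data, and the grid search are illustrative assumptions, not the paper's actual pipeline; the sketch also assumes the risk is non-increasing in the threshold, as conformal risk control requires.

```python
import numpy as np

def empirical_risk(lam, scores, unwanted):
    """Average per-user risk at threshold lam: the fraction of shown
    items (score >= lam) that the user flagged as unwanted.
    Users with no items shown contribute zero risk."""
    risks = []
    for s, u in zip(scores, unwanted):
        shown = s >= lam
        risks.append(u[shown].mean() if shown.sum() > 0 else 0.0)
    return float(np.mean(risks))

def calibrate(scores, unwanted, alpha, grid):
    """Return the smallest threshold in `grid` whose finite-sample
    corrected risk, (n / (n + 1)) * R_hat + B / (n + 1) with B = 1
    for a [0, 1]-bounded loss, is at most alpha. Assumes the risk is
    non-increasing in the threshold."""
    n = len(scores)
    for lam in sorted(grid):
        r = empirical_risk(lam, scores, unwanted)
        if (n / (n + 1)) * r + 1.0 / (n + 1) <= alpha:
            return float(lam)
    return float(max(grid))  # fall back to the most conservative threshold
```

Because the calibration averages feedback across all users, a small coordinated group that mass-reports benign items inflates the empirical risk and forces a more conservative threshold for everyone, which is exactly the aggregation weakness the paper exploits; the proposed mitigation would instead calibrate a per-user threshold from that user's own feedback.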
Problem

Research questions and friction points this paper is trying to address.

collective manipulation
risk-controlling recommender systems
adversarial behavior
user feedback
algorithmic vulnerability
Innovation

Methods, ideas, or system contributions that make the work stand out.

risk-controlling recommender systems
collective manipulation
conformal risk control
adversarial coordination
user-level guarantees