🤖 AI Summary
This work identifies and mitigates positional bias in large language models (LLMs) for recommendation tasks: the ordering of candidate items in the prompt significantly influences outputs, undermining result stability and fairness. To address this, we propose RISE (Rank-Iterative Selection and Evaluation), a zero-shot, fine-tuning-free, post-processing-free iterative ranking strategy. RISE decouples recommendation decisions from input order via dynamic reordering and selective evaluation. To our knowledge, this is the first systematic study to both diagnose and alleviate LLMs’ positional sensitivity in recommendation. Extensive experiments on multiple real-world datasets demonstrate that RISE substantially reduces ordering sensitivity, improves recommendation consistency (+12.7% Kendall’s τ), and enhances robustness—achieving 3.2× greater resilience to perturbations. Our approach establishes a new paradigm for trustworthy, LLM-driven recommendation systems.
📝 Abstract
Large Language Models (LLMs) are being increasingly explored as general-purpose tools for recommendation tasks, enabling zero-shot and instruction-following capabilities without the need for task-specific training. While the research community is enthusiastically embracing LLMs, there are important caveats to directly adapting them for recommendation tasks. In this paper, we show that LLM-based recommendation models suffer from position bias, where the order of candidate items in a prompt can disproportionately influence the recommendations produced by LLMs. First, we analyse the position bias of LLM-based recommendations on real-world datasets, where results uncover systemic biases of LLMs with high sensitivity to input orders. Furthermore, we introduce a new prompting strategy to mitigate the position bias of LLM recommendation models called Ranking via Iterative SElection (RISE). We compare our proposed method against various baselines on key benchmark datasets. Experiment results show that our method reduces sensitivity to input ordering and improves stability without requiring model fine-tuning or post-processing.