Do Large Language Models Favor Recent Content? A Study on Recency Bias in LLM-Based Reranking

πŸ“… 2025-09-14
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work identifies a significant recency bias in large language models (LLMs) used for information retrieval re-ranking: LLMs implicitly favor documents with more recent publication timestamps. Using the TREC DL21/DL22 benchmarks, we systematically quantify this bias via controlled synthetic timestamp injection in both listwise re-ranking and pairwise preference experiments. Our results demonstrate that even state-of-the-art LLMs fail to mitigate this effect: date injection shifts the mean publication year of the top-10 forward by up to 4.78 years, moves individual passages by as many as 95 ranks in a single re-ranking step, and reverses pairwise preferences between equally relevant passages in up to 25% of cases. To our knowledge, this is the first study to empirically establish and rigorously measure recency bias in LLM-based re-ranking. The findings provide critical evidence and a reproducible evaluation framework for developing temporally fair retrieval systems.
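The headline "mean publication year of the top-10" metric can be sketched as below; the function name and inputs are illustrative, not taken from the paper's code:

```python
def topk_mean_year_shift(years_before, years_after, k=10):
    """Shift in mean publication year of the top-k ranked passages
    after synthetic date injection (illustrative inputs: publication
    years listed in ranked order, before vs. after injection)."""
    mean = lambda ys: sum(ys[:k]) / k
    return mean(years_after) - mean(years_before)

# Toy example: the reranker pulls newer passages into the top ranks.
print(topk_mean_year_shift([2010] * 10, [2015] * 10))  # 5.0
```

A positive shift indicates the reranker promoted fresher passages, which is the direction of bias the paper reports.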

πŸ“ Abstract
Large language models (LLMs) are increasingly deployed in information systems, including as second-stage rerankers in information retrieval pipelines, yet their susceptibility to recency bias has received little attention. We investigate whether LLMs implicitly favour newer documents by prepending artificial publication dates to passages in the TREC Deep Learning passage retrieval collections from 2021 (DL21) and 2022 (DL22). Across seven models (GPT-3.5-turbo, GPT-4o, GPT-4, LLaMA-3 8B/70B, and Qwen-2.5 7B/72B), "fresh" passages are consistently promoted, shifting the mean publication year of the Top-10 forward by up to 4.78 years and moving individual items by as many as 95 ranks in our listwise reranking experiments. Although larger models attenuate the effect, none eliminate it. In our pairwise preference experiments, we also observe that date injection reverses the LLM's preference between two passages of identical relevance in up to 25% of cases on average. These findings provide quantitative evidence of a pervasive recency bias in LLMs and highlight the importance of effective bias-mitigation strategies.
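The date-injection step described in the abstract can be sketched as follows; the date format, placement, and year range are assumptions, since the paper's exact prompt template is not reproduced here:

```python
import random

def inject_timestamp(passage: str, year: int) -> str:
    """Prepend a synthetic publication date to a passage (format assumed)."""
    return f"Published: {year}-01-01. {passage}"

def make_dated_candidates(passages, years=range(2000, 2023), seed=0):
    """Assign each candidate passage a synthetic year before reranking."""
    rng = random.Random(seed)
    return [inject_timestamp(p, rng.choice(list(years))) for p in passages]

dated = make_dated_candidates(["Glaciers are retreating worldwide.",
                               "Glacier retreat accelerated last decade."])
print(dated[0])
```

Because the dates are synthetic and assigned independently of content, any resulting rank movement can be attributed to the timestamp rather than to passage relevance.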
Problem

Research questions and friction points this paper is trying to address.

Investigating recency bias in LLM-based document reranking systems
Measuring how publication dates influence LLM ranking preferences
Quantifying systematic favoritism toward newer content across models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prepending artificial publication dates to passages
Conducting listwise reranking experiments across models
Performing pairwise preference tests with date injection
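The pairwise preference test above can be scored as a reversal rate; a minimal sketch, assuming judgments for each equally relevant pair are recorded as 'A'/'B' labels before and after date injection:

```python
def reversal_rate(base_prefs, injected_prefs):
    """Fraction of equally relevant pairs whose preferred passage flips
    after synthetic dates are injected (label scheme is illustrative)."""
    if len(base_prefs) != len(injected_prefs):
        raise ValueError("judgment lists must be aligned pair-for-pair")
    flips = sum(b != a for b, a in zip(base_prefs, injected_prefs))
    return flips / len(base_prefs)

# One flip in four pairs -> 0.25, the scale of reversal the paper reports.
print(reversal_rate(["A", "B", "A", "A"], ["A", "B", "B", "A"]))  # 0.25
```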
Hanpei Fang
Waseda University, Tokyo, Japan
Sijie Tao
Waseda University, Tokyo, Japan
Nuo Chen
The Hong Kong Polytechnic University, Hong Kong, P.R.C.
Kai-Xin Chang
Waseda University, Tokyo, Japan
Tetsuya Sakai
Waseda University
information retrieval Β· interaction Β· natural language processing Β· social good