Do Large Language Models Favor Recent Content? A Study on Recency Bias in LLM-Based Reranking

πŸ“… 2025-09-14
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work identifies a significant recency bias in large language models (LLMs) used for information retrieval re-ranking: LLMs implicitly favor documents with more recent publication timestamps. Using the TREC DL21/DL22 benchmarks, we systematically quantify this bias via controlled synthetic timestamp injection in both listwise re-ranking and pairwise preference experiments. Our results demonstrate that even state-of-the-art LLMs fail to mitigate this effect: date injection shifts the mean publication year of the top-10 forward by up to 4.78 years, moves individual passages by as many as 95 ranks in a single re-ranking step, and reverses pairwise preferences between equally relevant passages in up to 25% of cases. To our knowledge, this is the first study to empirically establish and rigorously measure recency bias in LLM-based re-ranking. The findings provide critical evidence and a reproducible evaluation framework for developing temporally fair retrieval systems.
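The headline "mean publication year of the top-10" metric can be sketched as below; the function name and inputs are illustrative, not taken from the paper's code:

```python
def topk_mean_year_shift(years_before, years_after, k=10):
    """Shift in mean publication year of the top-k ranked passages
    after synthetic date injection (illustrative inputs: publication
    years listed in ranked order, before vs. after injection)."""
    mean = lambda ys: sum(ys[:k]) / k
    return mean(years_after) - mean(years_before)

# Toy example: the reranker pulls newer passages into the top ranks.
print(topk_mean_year_shift([2010] * 10, [2015] * 10))  # 5.0
```

A positive shift indicates the reranker promoted fresher passages, which is the direction of bias the paper reports.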

πŸ“ Abstract
Large language models (LLMs) are increasingly deployed in information systems, including as second-stage rerankers in information retrieval pipelines, yet their susceptibility to recency bias has received little attention. We investigate whether LLMs implicitly favour newer documents by prepending artificial publication dates to passages in the TREC Deep Learning passage retrieval collections from 2021 (DL21) and 2022 (DL22). Across seven models (GPT-3.5-turbo, GPT-4o, GPT-4, LLaMA-3 8B/70B, and Qwen-2.5 7B/72B), "fresh" passages are consistently promoted, shifting the mean publication year of the Top-10 forward by up to 4.78 years and moving individual items by as many as 95 ranks in our listwise reranking experiments. Although larger models attenuate the effect, none eliminate it. In our pairwise preference experiments, we also observe that date injection reverses the LLM's preference between two passages of identical relevance in up to 25% of cases on average. These findings provide quantitative evidence of a pervasive recency bias in LLMs and highlight the importance of effective bias-mitigation strategies.
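The date-injection step described in the abstract can be sketched as follows; the date format, placement, and year range are assumptions, since the paper's exact prompt template is not reproduced here:

```python
import random

def inject_timestamp(passage: str, year: int) -> str:
    """Prepend a synthetic publication date to a passage (format assumed)."""
    return f"Published: {year}-01-01. {passage}"

def make_dated_candidates(passages, years=range(2000, 2023), seed=0):
    """Assign each candidate passage a synthetic year before reranking."""
    rng = random.Random(seed)
    return [inject_timestamp(p, rng.choice(list(years))) for p in passages]

dated = make_dated_candidates(["Glaciers are retreating worldwide.",
                               "Glacier retreat accelerated last decade."])
print(dated[0])
```

Because the dates are synthetic and assigned independently of content, any resulting rank movement can be attributed to the timestamp rather than to passage relevance.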
Problem

Research questions and friction points this paper is trying to address.

Investigating recency bias in LLM-based document reranking systems
Measuring how publication dates influence LLM ranking preferences
Quantifying systematic favoritism toward newer content across models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prepending artificial publication dates to passages
Conducting listwise reranking experiments across models
Performing pairwise preference tests with date injection
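The pairwise preference test above can be scored as a reversal rate; a minimal sketch, assuming judgments for each equally relevant pair are recorded as 'A'/'B' labels before and after date injection:

```python
def reversal_rate(base_prefs, injected_prefs):
    """Fraction of equally relevant pairs whose preferred passage flips
    after synthetic dates are injected (label scheme is illustrative)."""
    if len(base_prefs) != len(injected_prefs):
        raise ValueError("judgment lists must be aligned pair-for-pair")
    flips = sum(b != a for b, a in zip(base_prefs, injected_prefs))
    return flips / len(base_prefs)

# One flip in four pairs -> 0.25, the scale of reversal the paper reports.
print(reversal_rate(["A", "B", "A", "A"], ["A", "B", "B", "A"]))  # 0.25
```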
Hanpei Fang
Waseda University, Tokyo, Japan
Sijie Tao
Waseda University, Tokyo, Japan
Nuo Chen
The Hong Kong Polytechnic University, Hong Kong, P.R.C.
Kai-Xin Chang
Waseda University, Tokyo, Japan
Tetsuya Sakai
Waseda University
information retrieval Β· interaction Β· natural language processing Β· social good