Uncertainty and Fairness Awareness in LLM-Based Recommendation Systems

📅 2026-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses reliability and fairness challenges in large language model (LLM)-based recommender systems, which are often undermined by prediction uncertainty and embedded biases. We propose the first fairness benchmark to integrate user personality profiles, paired with an uncertainty-aware evaluation framework. Prediction uncertainty is quantified via entropy, while group fairness disparities are measured with two novel metrics, the Similarity-Normalized Statistical Ratio (SNSR) and its variance counterpart (SNSV). Through prompt-perturbation experiments on multilingual datasets annotated with demographic attributes, we show that Gemini 1.5 Flash exhibits significant unfairness with respect to sensitive attributes (SNSR = 0.1363, SNSV = 0.0507), and that this unfairness is robust to perturbation. Our framework characterizes the trade-off between personalization and fairness, advancing the development of trustworthy and interpretable recommendation systems.
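
The summary quantifies prediction uncertainty via entropy. As a minimal sketch of what such a measure can look like (the paper's exact formulation is not reproduced here, and the probability distribution over candidate items is an assumption), Shannon entropy over a model's item distribution is lowest when the recommender is confident and peaks when it is maximally unsure:

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a distribution over candidate items.

    Higher entropy means the model is less certain about which item to
    recommend. `probs` is assumed to be a normalized distribution, e.g.
    softmax-normalized scores the LLM assigns to each candidate.
    """
    return -sum(p * math.log(p) for p in probs if p > 0)

# A confident recommender vs. a maximally uncertain one over 4 candidates.
confident = [0.85, 0.05, 0.05, 0.05]
uncertain = [0.25, 0.25, 0.25, 0.25]
print(predictive_entropy(confident))  # ~0.59 nats
print(predictive_entropy(uncertain))  # ~1.39 nats (the maximum, ln 4)
```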


📝 Abstract
Large language models (LLMs) enable powerful zero-shot recommendations by leveraging broad contextual knowledge, yet predictive uncertainty and embedded biases threaten their reliability and fairness. This paper studies how uncertainty and fairness evaluation affect the accuracy, consistency, and trustworthiness of LLM-generated recommendations. We introduce a benchmark of curated metrics and a dataset annotated for eight demographic attributes (31 categorical values) across two domains: movies and music. Through in-depth case studies, we quantify predictive uncertainty via entropy and demonstrate that Google DeepMind's Gemini 1.5 Flash exhibits systematic unfairness for certain sensitive attributes, with measured similarity-based gaps of SNSR = 0.1363 and SNSV = 0.0507. These disparities persist under prompt perturbations such as typographical errors and multilingual inputs. We further integrate personality-aware fairness into the RecLLM evaluation pipeline to reveal personality-linked bias patterns and to expose trade-offs between personalization and group fairness. We propose a novel uncertainty-aware evaluation methodology for RecLLMs, present empirical insights from in-depth uncertainty case studies, and introduce a personality-profile-informed fairness benchmark that advances explainability and equity in LLM recommendations. Together, these contributions lay a foundation for safer, more interpretable RecLLMs and motivate future work on multi-model benchmarks and adaptive calibration for trustworthy deployment.
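
The abstract reports similarity-based fairness gaps (SNSR and SNSV) between recommendations generated with and without sensitive attributes. The paper's exact definitions are not reproduced here; the sketch below follows a common construction in the RecLLM fairness literature, where each demographic group's recommendation list is compared to an attribute-free baseline and the range and variance of those similarities serve as the two gap measures. The Jaccard similarity, group labels, and function names are illustrative assumptions:

```python
import statistics

def jaccard(list_a, list_b):
    """Overlap between two top-k recommendation lists (a stand-in; the
    paper may use a different similarity, e.g. rank-weighted overlap)."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / len(a | b)

def similarity_gaps(group_recs, neutral_recs):
    """Range (SNSR-style) and variance (SNSV-style) of per-group
    similarities to the attribute-free baseline recommendations."""
    sims = [jaccard(recs, neutral_recs) for recs in group_recs.values()]
    return max(sims) - min(sims), statistics.pvariance(sims)

# Toy example: top-4 lists from a neutral prompt vs. prompts that add a
# sensitive attribute (group labels and items are illustrative only).
neutral = ["item1", "item2", "item3", "item4"]
by_group = {
    "group_a": ["item1", "item2", "item3", "item5"],
    "group_b": ["item1", "item6", "item7", "item8"],
}
snsr_like, snsv_like = similarity_gaps(by_group, neutral)
print(f"range={snsr_like:.4f}, variance={snsv_like:.4f}")
```

A larger range means at least one group's recommendations drift far from the neutral baseline, which is the kind of disparity the reported SNSR = 0.1363 summarizes across sensitive attributes.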
Problem

Research questions and friction points this paper is trying to address.

uncertainty
fairness
LLM-based recommendation
bias
trustworthiness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uncertainty-aware evaluation
Fairness-aware recommendation
Personality-informed fairness
RecLLM benchmark
Predictive entropy