Enhancing Rating Prediction with Off-the-Shelf LLMs Using In-Context User Reviews

📅 2025-09-30
🤖 AI Summary
This work studies Likert-scale rating prediction for recommender systems, with particular attention to the cold-start problem. Rather than training collaborative filtering or matrix factorization models, it evaluates off-the-shelf large language models (LLMs) that receive a user's own written reviews as in-context input and predict a rating, which is a regression task. Experiments with eight models across three datasets show that user-written reviews significantly improve prediction performance, to a level comparable with traditional methods such as matrix factorization, suggesting that LLMs are a promising option under cold-start conditions. Reviews of concrete items prove more effective than general preference descriptions that are not tied to any specific item. Prompting the model to first generate a hypothetical review for the target item, and only then predict the rating, improves performance further.

📝 Abstract
Personalizing the outputs of large language models (LLMs) to align with individual user preferences is an active research area. However, previous studies have mainly focused on classification or ranking tasks and have not considered Likert-scale rating prediction, a regression task that requires both language and mathematical reasoning to be solved effectively. This task has significant industrial applications, but the use of LLMs for it remains underexplored, particularly regarding the capabilities of off-the-shelf LLMs. This study investigates the performance of off-the-shelf LLMs on rating prediction when they are given different kinds of in-context information. Through comprehensive experiments with eight models across three datasets, we demonstrate that user-written reviews significantly improve the rating prediction performance of LLMs. The resulting performance is comparable to that of traditional methods such as matrix factorization, highlighting the potential of LLMs as a promising solution for the cold-start problem. We also find that reviews of concrete items are more effective than general preference descriptions that are not based on any specific item. Furthermore, we discover that prompting LLMs to first generate a hypothetical review enhances the rating prediction performance. Our code is available at https://github.com/ynklab/rating-prediction-with-reviews.

Problem

Research questions and friction points this paper is trying to address.

Predicting Likert-scale ratings using off-the-shelf LLMs
Enhancing rating prediction with in-context user reviews
Addressing the cold-start problem through LLM-based rating systems
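
As a rough illustration of what rating prediction with in-context user reviews could look like, the sketch below assembles a prompt from a user's past reviews and parses a Likert-scale rating from a model's reply. The function names, the prompt wording, the record fields (`item`, `rating`, `text`), and the 1-5 scale are all assumptions made for this example; they are not taken from the paper or its repository.

```python
import re
from typing import Dict, List, Optional, Tuple


def build_rating_prompt(
    past_reviews: List[Dict],
    target_item: str,
    scale: Tuple[int, int] = (1, 5),
) -> str:
    """Assemble a prompt that lists a user's past reviews, then asks for a rating.

    The wording is hypothetical; the paper's actual prompts may differ.
    """
    low, high = scale
    lines = ["Here are reviews previously written by the user:"]
    for i, review in enumerate(past_reviews, start=1):
        lines.append(
            f"{i}. Item: {review['item']} | Rating: {review['rating']} "
            f"| Review: {review['text']}"
        )
    lines.append(
        f"Predict the rating ({low}-{high}) this user would give to: {target_item}"
    )
    lines.append("Answer with a single number.")
    return "\n".join(lines)


def parse_rating(response: str, scale: Tuple[int, int] = (1, 5)) -> Optional[float]:
    """Take the first number in the reply and clamp it to the rating scale."""
    low, high = scale
    match = re.search(r"\d+(?:\.\d+)?", response)
    if match is None:
        return None  # the model did not produce a usable number
    return float(min(max(float(match.group()), low), high))
```

Clamping keeps out-of-range replies on the scale, and returning `None` lets the caller decide how to treat replies that contain no number at all.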

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using user reviews for LLM rating prediction
Comparing LLMs with traditional matrix factorization
Generating hypothetical reviews to enhance predictions
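
The hypothetical-review idea can be sketched as two model calls: one that drafts the review the user might write for the target item, and one that predicts the rating from that draft. This is only one plausible reading. The abstract does not say whether the review and rating come from separate calls or a single prompt, and the `llm` callable, the function name, and the prompt wording below are assumptions for illustration.

```python
import re
from typing import Callable, Optional, Tuple


def predict_with_hypothetical_review(
    llm: Callable[[str], str],
    user_context: str,
    target_item: str,
) -> Tuple[str, Optional[float]]:
    """Generate a hypothetical review first, then predict a rating from it.

    `llm` is any function mapping a prompt string to a reply string
    (hypothetical interface; plug in whichever client you use).
    """
    # Stage 1: ask for the review the user would plausibly write.
    review_prompt = (
        f"{user_context}\n\n"
        f"Write the review this user would likely write for: {target_item}"
    )
    hypothetical_review = llm(review_prompt)

    # Stage 2: predict the rating, conditioning on the generated review.
    rating_prompt = (
        f"{user_context}\n\n"
        f"Hypothetical review of {target_item}: {hypothetical_review}\n"
        "Predict the rating (1-5) the user would give. "
        "Answer with a single number."
    )
    answer = llm(rating_prompt)

    match = re.search(r"\d+(?:\.\d+)?", answer)
    rating = float(match.group()) if match else None
    return hypothetical_review, rating
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client and makes it easy to substitute a stub when testing.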