Enhancing Rating Prediction with Off-the-Shelf LLMs Using In-Context User Reviews

📅 2025-09-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies Likert-scale rating prediction for recommender systems, with attention to the cold-start problem. Instead of training collaborative filtering or matrix factorization models, it evaluates off-the-shelf large language models (LLMs) that receive a user's own written reviews as in-context input and predict a rating as a regression task. Two findings stand out. First, reviews of concrete items are more effective than abstract descriptions of a user's general preferences. Second, prompting the model to first generate a hypothetical review of the target item, and then predict the rating, improves performance. Experiments with eight models across three datasets show that review-conditioned LLMs reach performance comparable to classical matrix factorization, which makes them a promising option for cold-start settings. According to the authors, prior work on LLM personalization has focused on classification and ranking tasks and has not considered Likert-scale rating prediction.

📝 Abstract

Personalizing the outputs of large language models (LLMs) to align with individual user preferences is an active research area. However, previous studies have mainly focused on classification or ranking tasks and have not considered Likert-scale rating prediction, a regression task that requires both language and mathematical reasoning to be solved effectively. This task has significant industrial applications, but the use of LLMs for it remains underexplored, particularly regarding the capabilities of off-the-shelf LLMs. This study investigates the performance of off-the-shelf LLMs on rating prediction when they are given different kinds of in-context information. Through comprehensive experiments with eight models across three datasets, we demonstrate that user-written reviews significantly improve the rating prediction performance of LLMs. The resulting performance is comparable to that of traditional methods such as matrix factorization, highlighting the potential of LLMs as a promising solution for the cold-start problem. We also find that reviews of concrete items are more effective than general preference descriptions that are not based on any specific item. Furthermore, we discover that prompting LLMs to first generate a hypothetical review enhances rating prediction performance. Our code is available at https://github.com/ynklab/rating-prediction-with-reviews.
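
To make the idea of in-context user reviews concrete, here is a minimal Python sketch of how a prompt might be assembled from a user's past reviews and how a numeric rating might be read back from a model reply. It is not taken from the authors' repository: the `PastReview` fields, the prompt wording, the 1 to 5 scale, and the clamping rule are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class PastReview:
    # Hypothetical record layout; the paper's datasets may use other fields.
    item_title: str
    text: str
    rating: int


def build_prompt(history, target_title, scale=(1, 5)):
    """Assemble an in-context prompt from one user's past reviews."""
    low, high = scale
    lines = [
        f"Below are reviews written by one user, each with a rating from {low} to {high}.",
        "",
    ]
    for r in history:
        lines.append(f"Item: {r.item_title}")
        lines.append(f"Review: {r.text}")
        lines.append(f"Rating: {r.rating}")
        lines.append("")
    lines.append(f"Predict this user's rating for the item: {target_title}")
    lines.append(f"Answer with a single number between {low} and {high}.")
    return "\n".join(lines)


def parse_rating(reply, scale=(1, 5)):
    """Take the first number in a model reply and clamp it to the scale.

    Returns None when the reply contains no number at all.
    """
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        return None
    low, high = scale
    return min(max(float(match.group()), low), high)
```

The prompt string would be sent to whichever LLM is being evaluated; the parsing step matters because a regression-style task needs a number even when the model answers in free text.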

Problem

Research questions and friction points this paper is trying to address.

Predicting Likert-scale ratings using off-the-shelf LLMs

Enhancing rating prediction with in-context user reviews

Addressing the cold-start problem through LLM-based rating prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using user reviews for LLM rating prediction

Comparing LLMs with traditional matrix factorization

Generating hypothetical reviews to enhance predictions
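
The last point, generating a hypothetical review before predicting, can be sketched as a two-step flow. The Python below is an illustrative assumption rather than the authors' implementation: `generate` stands for any function that sends a prompt to an LLM and returns its text, the prompt wording is invented, and splitting the work into two separate calls is one possible arrangement (the paper may instead use a single prompt).

```python
import re
from typing import Callable, Optional, Tuple


def predict_with_hypothetical_review(
    generate: Callable[[str], str],  # hypothetical LLM wrapper: prompt -> reply
    user_context: str,               # e.g. the user's past reviews as text
    target_title: str,
) -> Tuple[str, Optional[float]]:
    """Ask for a hypothetical review first, then a rating based on it."""
    # Step 1: have the model write the review this user might write.
    review_prompt = (
        f"{user_context}\n\n"
        f"Write the review this user would likely write for: {target_title}"
    )
    hypothetical_review = generate(review_prompt).strip()

    # Step 2: predict the rating with the hypothetical review in context.
    rating_prompt = (
        f"{user_context}\n\n"
        f"Item: {target_title}\n"
        f"Review: {hypothetical_review}\n"
        "Rating (a single number from 1 to 5):"
    )
    reply = generate(rating_prompt)

    # Read the first number in the reply and clamp it to the assumed 1-5 scale.
    match = re.search(r"\d+(?:\.\d+)?", reply)
    rating = None if match is None else min(max(float(match.group()), 1.0), 5.0)
    return hypothetical_review, rating
```

Passing `generate` as a parameter keeps the sketch independent of any particular model API, so the same flow can be tried with different off-the-shelf LLMs.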
