REGEN: A Dataset and Benchmarks with Natural Language Critiques and Narratives

📅 2025-03-14
📈 Citations: 0 (influential: 0)
🤖 AI Summary
Existing recommendation benchmarks focus predominantly on sequential item prediction, neglecting users' steering critiques and narrative explanations. To address this gap, the paper proposes REGEN, a conversational recommendation dataset and benchmark built by augmenting Amazon Product Reviews with inpainted user critiques and context-aware recommendation narratives, enabling the joint generation of recommended items and natural-language explanations. Methodologically, the paper introduces LUMEN, a unified multi-task framework that uses an LLM backbone for critiquing, item retrieval, and narrative generation. Experiments show that incorporating user critiques improves recommendation quality, and that LUMEN generates both recommendations and contextual narratives with performance comparable to state-of-the-art recommenders and language models, combining interpretability, conversational capability, and recommendation efficacy.

📝 Abstract
This paper introduces a novel dataset REGEN (Reviews Enhanced with GEnerative Narratives), designed to benchmark the conversational capabilities of recommender Large Language Models (LLMs), addressing the limitations of existing datasets that primarily focus on sequential item prediction. REGEN extends the Amazon Product Reviews dataset by inpainting two key natural language features: (1) user critiques, representing user "steering" queries that lead to the selection of a subsequent item, and (2) narratives, rich textual outputs associated with each recommended item taking into account prior context. The narratives include product endorsements, purchase explanations, and summaries of user preferences. Further, we establish an end-to-end modeling benchmark for the task of conversational recommendation, where models are trained to generate both recommendations and corresponding narratives conditioned on user history (items and critiques). For this joint task, we introduce a modeling framework LUMEN (LLM-based Unified Multi-task Model with Critiques, Recommendations, and Narratives) which uses an LLM as a backbone for critiquing, retrieval and generation. We also evaluate the dataset's quality using standard auto-rating techniques and benchmark it by training both traditional and LLM-based recommender models. Our results demonstrate that incorporating critiques enhances recommendation quality by enabling the recommender to learn language understanding and integrate it with recommendation signals. Furthermore, LLMs trained on our dataset effectively generate both recommendations and contextual narratives, achieving performance comparable to state-of-the-art recommenders and language models.
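Per the abstract, each training instance pairs a user's item history and steering critique (the input) with the next item and its narrative (the target). A minimal sketch of how such an example might be serialized into a prompt/target pair; the field names, record schema, and formatting here are illustrative assumptions, not the dataset's actual layout:

```python
# Hypothetical REGEN-style record: field names ("history", "critique",
# "next_item", "narrative") are assumptions for illustration only.
def build_prompt(record):
    """Assemble an LLM input from user history plus the steering critique;
    the target couples the recommended item with its narrative."""
    history = "; ".join(
        f"{item['title']} (rating {item['rating']})" for item in record["history"]
    )
    prompt = (
        f"User history: {history}\n"
        f"User critique: {record['critique']}\n"
        "Recommend the next item and explain why:"
    )
    target = f"{record['next_item']} -- {record['narrative']}"
    return prompt, target

example = {
    "history": [
        {"title": "Acme Trail Shoes", "rating": 4},
        {"title": "Summit Day Pack", "rating": 5},
    ],
    "critique": "Something lighter and waterproof this time.",
    "next_item": "Ridge Ultralight Rain Jacket",
    "narrative": "A packable shell that fits the user's hiking gear and "
                 "their request for lighter, waterproof items.",
}

prompt, target = build_prompt(example)
```

In the joint task the paper describes, a model trained on such pairs must learn both the retrieval signal (which item) and the generation signal (why), which is what conditioning on critiques is meant to sharpen.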
Problem

Research questions and friction points this paper is trying to address.

Benchmark conversational capabilities of recommender LLMs.
Extend dataset with user critiques and narratives.
Enhance recommendation quality through language understanding.
Innovation

Methods, ideas, or system contributions that make the work stand out.

REGEN dataset enhances Amazon Reviews with critiques and narratives.
LUMEN framework integrates LLMs for multi-task conversational recommendations.
Incorporating critiques improves recommendation quality and language understanding.
Kun Su (Google Research) - Multimodal Learning, Audio/Music Generation, Recommendation Systems
Krishna Sayana (Google Research, Mountain View, CA, USA)
Hubert Pham (Google Research, Mountain View, CA, USA)
James Pine (Google Research, Mountain View, CA, USA)
Yuri Vasilevski (Google Research, Mountain View, CA, USA)
Raghavendra Vasudeva (Google Research, Mountain View, CA, USA)
Marialena Kyriakidi (Google Research, Mountain View, CA, USA)
Liam Hebert (University of Waterloo, Waterloo, Ontario, Canada)
Ambarish Jash (Google Research, Mountain View, CA, USA)
Anushya Subbiah (Google Research, Mountain View, CA, USA)
Sukhdeep Sodhi (Google Research, Mountain View, CA, USA)