🤖 AI Summary
Existing conversational recommender systems (CRS) rely on predefined attributes or costly domain-specific annotations, which limits their generalizability and cross-domain adaptability. To address this, we propose a lightweight, multi-domain CRS framework built on semantic snippets derived from user-generated content (e.g., reviews and open-ended responses), which removes the need for pre-specified attributes or extensive manual labeling. The method uses large language models to compress reviews and dialogue contexts into fine-grained snippets and align them semantically, then applies efficient vector-based retrieval to produce recommendations. Experiments across restaurant, book, and clothing domains show Hits@10 scores of 0.25 to 0.55, outperforming document- and sentence-level representations, and demonstrate stable performance on candidate sets of 3,000 to 10,000 items under free-form user inputs. The approach avoids intensive manual data collection and fine-tuning.
📝 Abstract
Conversational Recommender Systems (CRS) engage users in interactive dialogues to gather preferences and provide personalized recommendations. While existing studies have advanced conversational strategies, they often rely on predefined attributes or expensive, domain-specific annotated datasets, which limits both their flexibility in handling diverse user preferences and their adaptability across domains. We propose SnipRec, a novel resource-efficient approach that leverages user-generated content, such as customer reviews, to capture a broader range of user expressions. By employing large language models to map reviews and user responses into concise snippets, SnipRec represents user preferences and retrieves relevant items without the need for intensive manual data collection or fine-tuning. Experiments across the restaurant, book, and clothing domains show that snippet-based representations outperform document- and sentence-based representations, achieving Hits@10 of 0.25 to 0.55 with 3,000 to 10,000 candidate items while successfully handling free-form user responses.
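
The snippet-based retrieval and the Hits@k metric mentioned above can be illustrated with a small sketch. This is not SnipRec's implementation. The snippets are written by hand rather than produced by a large language model, the bag-of-words `embed` function stands in for a real embedding model, and the max-over-snippets scoring in `rank_items` is an assumption. All function names and item data are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    # Toy bag-of-words vector; a stand-in (assumption) for a real embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_items(query_snippets, item_snippets):
    # Assumed scoring rule: for each user snippet, take the best-matching
    # item snippet, then sum those best matches over all user snippets.
    query_vecs = [embed(q) for q in query_snippets]
    scores = {}
    for item, snippets in item_snippets.items():
        item_vecs = [embed(s) for s in snippets]
        scores[item] = sum(
            max((cosine(q, s) for s in item_vecs), default=0.0)
            for q in query_vecs
        )
    # Highest score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


def hits_at_k(rankings, targets, k=10):
    # Fraction of queries whose target item appears in the top-k ranking.
    if not targets:
        return 0.0
    hits = sum(1 for ranking, target in zip(rankings, targets) if target in ranking[:k])
    return hits / len(targets)


# Hypothetical hand-written snippets standing in for LLM-compressed reviews.
item_snippets = {
    "cafe_a": ["cozy atmosphere", "great espresso"],
    "diner_b": ["large portions", "friendly staff"],
    "bistro_c": ["romantic atmosphere", "good wine list"],
}

# Hypothetical snippet derived from a free-form user response.
ranking = rank_items(["cozy atmosphere"], item_snippets)
print(ranking)
print(hits_at_k([ranking], ["bistro_c"], k=2))
```

In this toy setup each item is represented by several short snippets rather than one long document, so a user snippet only needs to match one focused phrase to surface the item. Whether that mirrors the paper's exact scoring is not stated in the abstract.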