Evaluating User Experience in Conversational Recommender Systems: A Systematic Review Across Classical and LLM-Powered Approaches

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current user experience (UX) evaluation of conversational recommender systems (CRS) suffers from significant limitations: scarce empirical studies, especially for adaptive and large language model (LLM)-driven CRS; outdated evaluation protocols; and neglect of affective dynamics and their interplay with adaptive behaviours. Method: Guided by PRISMA, we systematically reviewed 23 empirical UX studies (2017–2025), employing qualitative synthesis and cross-case analysis to trace conceptual, methodological, and contextual evolutions and gaps. Contribution/Results: We propose the first UX evaluation framework tailored for LLM-based CRS, establishing a comparable metrics taxonomy and conducting the first systematic comparison of how adaptive mechanisms versus LLM-specific attributes differentially impact UX. Key findings reveal an overreliance on retrospective questionnaires and insufficient fine-grained affective feedback. This work provides an evidence-based foundation and methodological scaffolding for developing transparent, interaction-centric, and user-centred CRS evaluation paradigms.

📝 Abstract
Conversational Recommender Systems (CRSs) are receiving growing research attention across domains, yet their user experience (UX) evaluation remains limited. Existing reviews largely overlook empirical UX studies, particularly in adaptive and large language model (LLM)-based CRSs. To address this gap, we conducted a systematic review following PRISMA guidelines, synthesising 23 empirical studies published between 2017 and 2025. We analysed how UX has been conceptualised, measured, and shaped by domain, adaptivity, and LLM integration. Our findings reveal persistent limitations: post hoc surveys dominate, turn-level affective UX constructs are rarely assessed, and adaptive behaviours are seldom linked to UX outcomes. LLM-based CRSs introduce further challenges, including epistemic opacity and verbosity, yet evaluations infrequently address these issues. We contribute a structured synthesis of UX metrics, a comparative analysis of adaptive and non-adaptive systems, and a forward-looking agenda for LLM-aware UX evaluation. These findings support the development of more transparent, engaging, and user-centred CRS evaluation practices.
Problem

Research questions and friction points this paper is trying to address.

Evaluating user experience in conversational recommender systems
Addressing gaps in empirical UX studies for adaptive and LLM-based CRSs
Improving transparency and user-centered evaluation practices in CRSs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic review following PRISMA guidelines
Comparative analysis of adaptive systems
Forward-looking agenda for LLM-aware UX
Raj Mahmud
School of Computer Science, University of Technology Sydney, Australia
Yufeng Wu
School of Computer Science, University of Technology Sydney, Australia
Abdullah Bin Sawad
The Applied College, King Abdulaziz University, Saudi Arabia
Shlomo Berkovsky
Macquarie University
health informatics, human-AI interaction, personalization, user modeling
Mukesh Prasad
School of Computer Science, University of Technology Sydney, Australia
A. Baki Kocaballi
Senior Lecturer, University of Technology Sydney
Interaction Design, Conversational Interfaces, Artificial Intelligence, Human-AI Interaction