AI Summary
This study addresses the longstanding challenge in recommender systems of balancing diversity and personalization. The authors propose a large language model-based multi-agent architecture for movie recommendations and conduct a between-subjects experiment with 100 participants, integrating psychological scales and standard recommendation metrics to examine how single-agent versus multi-agent designs influence users' perceived novelty, diversity, and accuracy. Results indicate that the multi-agent system significantly increases perceived novelty and Shannon diversity. Furthermore, conscientiousness is positively associated with perceived accuracy and diversity, whereas extraversion is negatively associated with perceived diversity. Prior experience with GenAI-based recommendations is positively associated with Shannon diversity, while skepticism toward GenAI is negatively associated with it. Significant interactions between system design and user characteristics underscore the moderating role of individual differences in multi-agent recommendation effectiveness and highlight the need to move beyond one-size-fits-all approaches toward personality-aware conversational recommender systems.
Abstract
Diversity is an important evaluation criterion for recommender systems beyond accuracy, yet users differ in their willingness to engage with novel and diverse content. In this work, we investigate how a Large Language Model (LLM)-based multi-agent system supports users' exploration of diverse recommendations, and how individual characteristics shape user experiences. We conducted a between-subjects user study (N = 100) comparing a single-agent system (baseline) with a multi-agent system for movie recommendations. We measured Perceived Accuracy, diversity, novelty, and overall rating, and examined the influence of personal characteristics, including personality traits, demographics, GenAI recommendation experience, and GenAI skepticism. Results show that the multi-agent system significantly increases Perceived Novelty and Shannon Diversity. Conscientiousness is positively associated with Perceived Accuracy and diversity, whereas extraversion is negatively associated with Perceived Diversity. Prior experience with GenAI-based recommendations is positively associated with Shannon Diversity, while skepticism toward GenAI is negatively associated with it. We also observe significant interaction effects between system design and user characteristics. These findings highlight the importance of personality-aware conversational recommender systems and caution against one-size-fits-all multi-agent designs.
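For readers unfamiliar with the Shannon Diversity metric named above, the following is a minimal sketch of how such a score is commonly computed for a recommendation list. It assumes diversity is measured as the Shannon entropy of a categorical attribute (here, one hypothetical genre label per recommended movie) using the natural logarithm. The abstract does not state which attribute or log base the authors used, so the function name `shannon_diversity` and the genre-based input are illustrative assumptions, not the study's implementation.

```python
import math
from collections import Counter


def shannon_diversity(labels):
    """Shannon entropy H = -sum(p_i * ln p_i) over the categories in `labels`.

    `labels` is a list of category labels (assumed here: one genre per
    recommended movie). Returns 0.0 for an empty list or a single category.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total  # share of the list belonging to this category
        entropy -= p * math.log(p)
    return entropy


# Hypothetical recommendation lists, for illustration only.
narrow_list = ["drama", "drama", "drama", "drama"]
broad_list = ["drama", "comedy", "horror", "sci-fi"]

print(shannon_diversity(narrow_list))
print(shannon_diversity(broad_list))
```

Under this definition, a list drawn from a single category scores zero, and a list spread evenly over k categories reaches the maximum of ln(k), so higher values indicate a more evenly mixed set of recommendations.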