Biases in LLM-Generated Musical Taste Profiles for Recommendation

2025-07-22
Citations: 0
Influential: 0
AI Summary
This study investigates whether natural language (NL) music taste profiles automatically generated by large language models (LLMs) are perceived as accurate by users, and whether systematic biases exist with respect to user attributes (e.g., mainstreamness, taste diversity) and item characteristics (e.g., genre, country of origin). We generate NL profiles from listening histories, conduct a large-scale user survey, and evaluate downstream recommendation performance, which allows a joint analysis of profile endorsement and recommendation fairness that the summary presents as the first of its kind. Results reveal that users with higher mainstreamness and lower taste diversity exhibit significantly greater endorsement; conversely, non-Western and niche-genre items substantially reduce endorsement, and this bias persists in recommendation accuracy and coverage. Our work uncovers implicit cultural and cognitive biases embedded in LLM-driven explainable recommendation systems, providing empirical evidence and a methodological framework for designing fairer, more trustworthy personalized recommender systems.

Abstract
One particularly promising use case of Large Language Models (LLMs) for recommendation is the automatic generation of Natural Language (NL) user taste profiles from consumption data. These profiles offer interpretable and editable alternatives to opaque collaborative filtering representations, enabling greater transparency and user control. However, it remains unclear whether users consider these profiles to be an accurate representation of their taste, which is crucial for trust and usability. Moreover, because LLMs inherit societal and data-driven biases, profile quality may systematically vary across user and item characteristics. In this paper, we study this issue in the context of music streaming, where personalization is challenged by a large and culturally diverse catalog. We conduct a user study in which participants rate NL profiles generated from their own listening histories. We analyze whether identification with the profiles is biased by user attributes (e.g., mainstreamness, taste diversity) and item features (e.g., genre, country of origin). We also compare these patterns to those observed when using the profiles in a downstream recommendation task. Our findings highlight both the potential and limitations of scrutable, LLM-based profiling in personalized systems.
Problem

Research questions and friction points this paper is trying to address.

Assessing accuracy of LLM-generated music taste profiles for users
Investigating biases in profiles across user and item attributes
Evaluating profile performance in downstream recommendation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

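Generates NL user taste profiles from users' listening histories with an LLM
Analyzes biases in profile identification across user and item attributes via a user study
Compares these bias patterns with those observed in a downstream recommendation task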
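
The user attributes named above (mainstreamness, taste diversity) can be approximated in many ways, and this page does not state which definitions the paper uses. The following is a minimal sketch under assumed definitions: taste diversity as the normalized Shannon entropy of a user's genre distribution, and mainstreamness as the mean relative global popularity of the tracks a user listened to. The function names, inputs, and both formulas are hypothetical illustrations, not the authors' method.

```python
import math
from collections import Counter


def taste_diversity(genres):
    """Normalized Shannon entropy of the genres a user listened to, in [0, 1].

    Assumed proxy (not taken from the paper): 0 means a single genre,
    1 means plays spread evenly over all genres the user listened to.
    """
    counts = Counter(genres)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def mainstreamness(track_ids, global_play_counts):
    """Mean relative global popularity of a user's tracks, in [0, 1].

    Assumed proxy (not taken from the paper): each track's global play
    count is divided by the most-played track's count, then averaged
    over the user's listening history. Unknown tracks count as 0.
    """
    if not track_ids or not global_play_counts:
        return 0.0
    max_plays = max(global_play_counts.values())
    if max_plays == 0:
        return 0.0
    shares = [global_play_counts.get(t, 0) / max_plays for t in track_ids]
    return sum(shares) / len(shares)
```

Scores like these could then be used to group survey participants and compare average profile ratings across groups, which is one simple way to check whether endorsement varies with user attributes.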