🤖 AI Summary
Existing e-commerce user representations rely predominantly on black-box implicit embeddings, which are hard to interpret and hard to combine with external knowledge. This limits customer segmentation, search navigation, and recommendation performance. To address this, we propose *customer personas*: an interpretable, multi-faceted, human-readable explicit user representation (e.g., “value-conscious shopper” or “busy parent”). We introduce the GPLR framework, which applies large language model (LLM) reasoning to a fraction of users and uses graph random walks to predict personas for the remaining customers, making persona generation scalable. We further design RevAff, an approximation algorithm that guarantees an absolute error bound $\epsilon$ while reducing the time complexity of the exact solution. Evaluation on three real-world e-commerce datasets shows that integrating customer personas into a state-of-the-art graph convolution-based recommendation model yields up to a 12% improvement in NDCG@K and F1-Score@K. The representation is also evaluated for robustness and on a second task, customer segmentation.
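For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how NDCG@K and F1-Score@K are commonly computed for one user's ranked recommendation list. It assumes binary relevance (an item is either in the held-out set or not). The function names `ndcg_at_k` and `f1_at_k` are illustrative, and the paper's exact evaluation protocol may differ.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG@K for a single user (illustrative sketch)."""
    relevant = set(relevant_items)
    # Discounted gain: a hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal ordering places all relevant items (at most k of them) first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def f1_at_k(ranked_items, relevant_items, k):
    """Harmonic mean of precision@K and recall@K for a single user."""
    relevant = set(relevant_items)
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    if hits == 0:
        return 0.0
    precision = hits / k  # assumption: the denominator is k even for shorter lists
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)
```

Both functions score one user. Reported figures are typically averages over all test users.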
📝 Abstract
In e-commerce, user representations are essential for various applications. Existing methods often use deep learning techniques to convert customer behaviors into implicit embeddings. However, these embeddings are difficult to understand and to integrate with external knowledge, limiting the effectiveness of applications such as customer segmentation, search navigation, and product recommendations. To address this, our paper introduces the concept of the customer persona. Condensed from a customer's extensive purchase history, a customer persona provides a multi-faceted and human-readable characterization of specific purchase behaviors and preferences, such as Busy Parents or Bargain Hunters. This work then focuses on representing each customer by multiple personas from a predefined set, achieving readable and informative explicit user representations. To this end, we propose an effective and efficient solution, GPLR. To ensure effectiveness, GPLR leverages pre-trained LLMs to infer personas for customers. To reduce overhead, GPLR applies LLM-based labeling to only a fraction of users and utilizes a random walk technique to predict personas for the remaining customers. We further propose RevAff, which provides an absolute error $\epsilon$ guarantee while improving the time complexity of the exact solution by a factor of at least $O\left(\frac{\epsilon\cdot|E|N}{|E|+N\log N}\right)$, where $N$ represents the number of customers and products, and $E$ represents the set of interactions between them. We evaluate the performance of our persona-based representation in terms of accuracy and robustness for recommendation and customer segmentation tasks using three real-world e-commerce datasets. Most notably, we find that integrating customer persona representations improves the state-of-the-art graph convolution-based recommendation model by up to 12% in terms of NDCG@K and F1-Score@K.
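The two-stage idea described above (label a fraction of customers, then spread those labels over the interaction graph) can be illustrated with a minimal sketch. This is not the paper's GPLR or RevAff algorithm. It swaps in a generic random walk with restart, run by plain power iteration on the customer-product bipartite graph. The LLM step is replaced by a hand-written `labeled` dictionary. All function names, the restart probability, and the iteration count are assumptions chosen for illustration, and customer and product identifiers are assumed to be distinct strings.

```python
from collections import defaultdict


def build_graph(interactions):
    """Undirected bipartite adjacency built from (customer, product) pairs."""
    adj = defaultdict(set)
    for customer, product in interactions:
        adj[customer].add(product)
        adj[product].add(customer)
    return adj


def rwr_scores(adj, seeds, restart=0.15, iters=50):
    """Random walk with restart, starting uniformly from `seeds` (power iteration)."""
    seed_mass = {s: 1.0 / len(seeds) for s in seeds}
    scores = dict(seed_mass)
    for _ in range(iters):
        nxt = defaultdict(float)
        for node, mass in scores.items():
            neighbors = adj[node]
            if not neighbors:
                continue  # isolated node: its walking mass is dropped
            share = (1.0 - restart) * mass / len(neighbors)
            for nb in neighbors:
                nxt[nb] += share
        for s, m in seed_mass.items():
            nxt[s] += restart * m  # teleport back to the seed customers
        scores = dict(nxt)
    return scores


def propagate_personas(interactions, labeled, top_k=2):
    """Assign each unlabeled customer the top_k personas by walk affinity.

    `labeled` maps a customer to a list of personas; in the paper these labels
    come from an LLM, here they are simply given (hypothetical input).
    """
    adj = build_graph(interactions)
    seeds_by_persona = defaultdict(list)
    for customer, personas in labeled.items():
        for persona in personas:
            seeds_by_persona[persona].append(customer)

    # affinity[node][persona] = walk score of `node` when seeding from that persona
    affinity = defaultdict(dict)
    for persona, seeds in seeds_by_persona.items():
        for node, score in rwr_scores(adj, seeds).items():
            affinity[node][persona] = score

    result = {}
    for customer in {c for c, _ in interactions}:
        if customer in labeled:
            continue
        ranked = sorted(affinity[customer].items(), key=lambda kv: kv[1], reverse=True)
        result[customer] = [persona for persona, _ in ranked[:top_k]]
    return result


# Toy data: two labeled customers, two unlabeled ones with overlapping purchases.
interactions = [
    ("alice", "diapers"), ("alice", "wipes"),
    ("carol", "diapers"), ("carol", "wipes"),
    ("bob", "coupon_pack"), ("bob", "clearance_tee"),
    ("dave", "clearance_tee"),
]
labeled = {"alice": ["Busy Parent"], "bob": ["Bargain Hunter"]}
predicted = propagate_personas(interactions, labeled, top_k=1)
```

In this toy graph, `carol` shares products only with `alice` and `dave` only with `bob`, so each unlabeled customer inherits the persona of the labeled customer in the same connected component. Running one full walk per persona, as this sketch does, is the kind of exact computation that an approximation with an $\epsilon$ error guarantee would aim to speed up.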