🤖 AI Summary
Recommendation systems have long suffered from ID dependency, hindering transferability across platforms with no user or item overlap. To address this, we propose the first ID-free, end-to-end recommendation framework leveraging raw multimodal feedback (e.g., text and images), eliminating conventional ID-based embeddings. Our approach adopts a pretraining-and-transfer paradigm to learn modality-aware unified representations and a task-agnostic recommendation architecture. Key contributions include: (1) an ID-free modeling paradigm; (2) multimodal feature encoding coupled with contrastive cross-task representation alignment; and (3) cross-domain transfer capability without shared entities. We validate effectiveness across four real-world domains; source/target data scaling experiments demonstrate strong robustness. Our method significantly improves cold-start and cross-platform recommendation performance, establishing a new paradigm for general-purpose recommendation systems.
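The contrastive cross-task representation alignment mentioned above can be sketched with a standard symmetric InfoNCE objective: embeddings of the same item produced from two modalities (e.g., text and image) are pulled together, while mismatched pairs are pushed apart. This is a minimal illustrative sketch, not the paper's implementation; the encoder outputs, dimensions, and temperature are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, d = 5, 4

# Stand-ins for per-item outputs of a text encoder and an image encoder.
text_emb = rng.normal(size=(n_items, d))
image_emb = rng.normal(size=(n_items, d))

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def info_nce_loss(a, b, temperature=0.1):
    """Symmetric InfoNCE: matching rows of a and b are the positive pairs."""
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature  # (n_items, n_items) cosine similarities
    idx = np.arange(len(a))
    # Row-wise log-softmax in both directions (a -> b and b -> a).
    log_p_ab = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_ba = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the matching (diagonal) pairs.
    return -(log_p_ab[idx, idx].mean() + log_p_ba[idx, idx].mean()) / 2

loss = info_nce_loss(text_emb, image_emb)
print(float(loss))
```

Minimizing this loss aligns the two modality spaces so that either view of an item can stand in for the other downstream.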
📝 Abstract
Learning large-scale pre-trained models on broad-ranging data and then transferring them to a wide range of target tasks has become the de facto paradigm in many machine learning (ML) communities. Such big models are not only strong performers in practice but also offer a promising way to break free of task-specific modeling restrictions, thereby enabling task-agnostic and unified ML systems. However, this popular paradigm remains largely unexplored by the recommender systems (RS) community. A critical issue is that standard recommendation models are built primarily on categorical identity features: users and the items they interact with are represented by unique IDs, which are generally not shareable across different systems or platforms. To pursue transferable recommendations, we propose studying pre-trained RS models in a novel scenario where a user's interaction feedback involves mixture-of-modality (MoM) items, e.g., text and images. We then present TransRec, a very simple modification of the popular ID-based RS framework. TransRec learns directly from the raw features of the MoM items in an end-to-end manner and thus enables effective transfer learning in various scenarios without relying on overlapping users or items. We empirically study the transfer ability of TransRec across four different real-world recommendation settings. In addition, we examine its behavior when scaling the source and target data sizes. Our results suggest that learning neural recommendation models from MoM feedback offers a promising path toward universal RS.
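The core ID-free idea can be illustrated with a minimal sketch: instead of looking up a per-ID embedding table (which cannot transfer across platforms with disjoint ID vocabularies), each item is represented by passing its raw modality features through a shared encoder. The encoder, dimensions, and scoring rule below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d_raw, d_model = 8, 4

# A single shared projection standing in for a learned modality encoder.
# Because it operates on raw features rather than IDs, the same weights
# apply to items from any platform.
W = rng.normal(size=(d_raw, d_model))

def modality_encoder(raw_features, W):
    """Map raw modality features (e.g., text/image vectors) into the
    shared recommendation space."""
    return np.tanh(raw_features @ W)

# Two items from *different* platforms with no shared ID vocabulary:
item_a = rng.normal(size=d_raw)  # e.g., a text feature vector
item_b = rng.normal(size=d_raw)  # e.g., an image feature vector

z_a = modality_encoder(item_a, W)
z_b = modality_encoder(item_b, W)

# A user representation is scored against items by dot product in the
# shared space, so no per-item embedding table is ever needed.
user = rng.normal(size=d_model)
score_a, score_b = float(user @ z_a), float(user @ z_b)
print(score_a, score_b)
```

In an ID-based model, `item_b` from a new platform would have no embedding at all; here it is scored immediately, which is what enables transfer without overlapping users or items.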