🤖 AI Summary
This paper addresses the challenge of jointly modeling graph structure, user behavioral sequences, and multimodal content signals in large-scale, multi-entity heterogeneous graphs such as Pinterest's. It proposes OmniSage, an industrial-scale unified representation learning framework. Methodologically, OmniSage uses multiple contrastive learning tasks to combine graph neural networks, user sequence models, and content-based encoders. It is backed by a scalable infrastructure that supports efficient training and inference on graphs with billions of nodes. Key contributions include: (1) a contrastive learning setup that spans the graph, sequence, and content views of heterogeneous entities, and (2) joint optimization of the resulting representations across modalities and tasks. Experiments show that the learned universal representations improve user experience, yielding an approximately 2.5% increase in sitewide repins (saves) across five production applications. The authors state that the code will be open-sourced.
📝 Abstract
Representation learning, the task of learning latent vectors that represent entities, is key to improving search and recommender systems in web applications. Various representation learning methods have been developed, including graph-based approaches for modeling relationships among entities, sequence-based methods for capturing the temporal evolution of user activities, and content-based models for leveraging text and visual content. However, developing a unifying framework that integrates these diverse techniques to support multiple applications remains a significant challenge. This paper presents OmniSage, a large-scale representation framework that learns universal representations for a variety of applications at Pinterest. OmniSage integrates graph neural networks with content-based models and user sequence models by employing multiple contrastive learning tasks to effectively process graph data, user sequence data, and content signals. To support the training and inference of OmniSage, we developed an efficient infrastructure capable of supporting Pinterest graphs with billions of nodes. The universal representations generated by OmniSage have significantly enhanced user experiences on Pinterest, leading to an approximately 2.5% increase in sitewide repins (saves) across five applications. This paper highlights the impact of unifying representation learning methods, and we will open source the OmniSage code by the time of publication.
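To make the idea of "multiple contrastive learning tasks" concrete, the following is a minimal sketch rather than the OmniSage implementation. It assumes an in-batch InfoNCE loss per task and a simple weighted sum across tasks. The task names (`graph_neighbor`, `content`, `user_sequence`), the weights, the temperature, and the use of NumPy are illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def info_nce(queries, positives, temperature=0.1):
    """In-batch InfoNCE loss.

    Row i of `positives` is the positive example for row i of `queries`;
    every other row in the batch serves as a negative.
    """
    q = l2_normalize(queries)
    p = l2_normalize(positives)
    logits = q @ p.T / temperature
    # Subtract the row-wise max for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def multi_task_loss(entity_emb, targets, weights):
    """Weighted sum of one contrastive loss per task (hypothetical weighting scheme)."""
    return sum(weights[name] * info_nce(entity_emb, targets[name]) for name in targets)

# Toy data: 8 entities with 16-dimensional embeddings (random stand-ins for model outputs).
rng = np.random.default_rng(0)
entity = rng.normal(size=(8, 16))
targets = {
    # Hypothetical task names; each target is a noisy copy of the entity embedding.
    "graph_neighbor": entity + 0.1 * rng.normal(size=(8, 16)),
    "content": entity + 0.1 * rng.normal(size=(8, 16)),
    "user_sequence": entity + 0.1 * rng.normal(size=(8, 16)),
}
weights = {name: 1.0 for name in targets}
loss = multi_task_loss(entity, targets, weights)
print(f"combined contrastive loss: {loss:.4f}")
```

In this sketch every task pulls the same entity embedding toward a different kind of positive, which is one plausible way a single representation could be shaped by graph, content, and sequence signals at once. A production system would compute the embeddings with trainable encoders and backpropagate through the loss, both of which are omitted here.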