🤖 AI Summary
Existing recommender systems suffer from high model maintenance costs, strong task coupling, and complex feature engineering. Method: We present 360Brew V1.0, a 150B-parameter decoder-only large language model (LLM) for multi-task personalized ranking that covers over 30 predictive tasks on LinkedIn. By verbalizing member behavior and social connections as text, defining tasks via natural-language instructions, and training and fine-tuning a single model on LinkedIn's data and tasks, we remove the need for traditional feature engineering and for maintaining directed acyclic graphs (DAGs) of model dependencies. Contribution/Results: The single model handles all of these tasks without task-specific fine-tuning and can generalize to new recommendation surfaces and out-of-domain problems; on offline metrics it is comparable to or better than current production systems. Each of these tasks is conventionally served by a dedicated model developed and maintained over multiple years by a team of similar or larger size than our own. Our core idea is to replace dozens of heterogeneous recommendation models with a single LLM, establishing a text-based, instruction-driven paradigm for general-purpose recommendation.
📝 Abstract
Ranking and recommendation systems are the foundation for numerous online experiences, ranging from search results to personalized content delivery. These systems have evolved into complex, multilayered architectures that leverage vast datasets and often incorporate thousands of predictive models. Maintaining and enhancing these models is a labor-intensive process that requires extensive feature engineering. This approach not only exacerbates technical debt but also hampers innovation in extending these systems to emerging problem domains. In this report, we present our research to address these challenges by utilizing a large foundation model with a textual interface for ranking and recommendation tasks. We illustrate several key advantages of our approach: (1) a single model can manage multiple predictive tasks involved in ranking and recommendation, (2) decoder models with a textual interface, owing to their comprehension and reasoning capabilities, can generalize to new recommendation surfaces and out-of-domain problems, and (3) by employing natural language interfaces for task definitions and verbalizing member behaviors and their social connections, we eliminate the need for feature engineering and the maintenance of complex directed acyclic graphs of model dependencies. We introduce our research pre-production model, 360Brew V1.0, a 150B-parameter, decoder-only model that has been trained and fine-tuned on LinkedIn's data and tasks. This model is capable of solving over 30 predictive tasks across various segments of the LinkedIn platform, achieving performance levels comparable to or exceeding those of current production systems based on offline metrics, without task-specific fine-tuning. Notably, each of these tasks is conventionally addressed by dedicated models that have been developed and maintained over multiple years by teams of a similar or larger size than our own.
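To make point (3) concrete, the following is a minimal sketch of what "verbalizing" member behavior and defining a task in natural language could look like. It is not the 360Brew prompt format, which the abstract does not specify: the `Member` and `Interaction` classes, their fields, the `verbalize` function, and the prompt wording are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    """One past member action reduced to plain text (hypothetical schema)."""
    item_text: str  # e.g. a short description of a post or job
    action: str     # e.g. "liked", "applied to", "skipped"


@dataclass
class Member:
    """Hypothetical member record; field names are illustrative only."""
    headline: str
    connections: list[str] = field(default_factory=list)
    history: list[Interaction] = field(default_factory=list)


def verbalize(member: Member, instruction: str, candidate: str) -> str:
    """Render a task instruction, member context, and candidate as one text prompt."""
    lines = [f"Instruction: {instruction}", f"Member profile: {member.headline}"]
    if member.connections:
        lines.append("Connections: " + "; ".join(member.connections))
    lines.append("Past activity:")
    for event in member.history:
        lines.append(f"- Member {event.action}: {event.item_text}")
    lines.append(f"Candidate: {candidate}")
    lines.append("Question: Will the member engage with the candidate? Answer yes or no.")
    return "\n".join(lines)


# Example usage with made-up data.
member = Member(
    headline="Data engineer",
    connections=["ML researcher", "Recruiter"],
    history=[Interaction("Post about streaming pipelines", "liked")],
)
prompt = verbalize(
    member,
    "Predict whether the member will like the post.",
    "Post about feature stores",
)
print(prompt)
```

In a setup like this, switching to a different predictive task would mean changing only the instruction and the verbalized context, not building a new feature pipeline, which is the property the abstract attributes to a textual interface.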