AI Summary
Existing recommender systems rely heavily on item ID embeddings while neglecting rich textual semantics, resulting in poor generalization and robustness. To address this, we propose LEARN, a framework that integrates a frozen large language model (LLM) as a fixed text encoder, thereby infusing open-world linguistic knowledge into collaborative filtering signals. LEARN employs a twin-tower architecture to jointly model ID-based and text-based representations, and is optimized for low-latency inference in industrial settings. Evaluated on six Amazon Review benchmarks, LEARN achieves state-of-the-art performance. Moreover, on large-scale production data and in online A/B tests, it delivers significant improvements, +4.2% in click-through rate (CTR) and +3.8% in conversion rate (CVR), while maintaining high computational efficiency and practical deployability.
Abstract
Contemporary recommendation systems predominantly rely on ID embeddings to capture latent associations among users and items. However, this approach overlooks the wealth of semantic information embedded within textual descriptions of items, leading to suboptimal performance and poor generalization. Leveraging the capability of large language models to comprehend and reason about textual content presents a promising avenue for advancing recommendation systems. To this end, we propose the Llm-driven knowlEdge Adaptive RecommeNdation (LEARN) framework, which synergizes open-world knowledge with collaborative knowledge. We address computational-complexity concerns by using pretrained LLMs as item encoders and freezing their parameters to avoid catastrophic forgetting and preserve open-world knowledge. To bridge the gap between the open-world and collaborative domains, we design a twin-tower structure that is supervised by the recommendation task and tailored for practical industrial application. Through experiments on a real large-scale industrial dataset and online A/B tests, we demonstrate the efficacy of our approach in industrial settings. We also achieve state-of-the-art performance on six Amazon Review datasets, verifying the superiority of our method.
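The core idea described above, a frozen LLM item encoder feeding a trainable projection, with a user tower built from interaction history and scored against candidate items, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: all dimensions, the mean-pooled user tower, and the cosine-similarity scoring are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for frozen LLM item embeddings (assumption: precomputed once,
# never updated during training, mimicking the frozen-LLM encoder).
llm_dim, cf_dim, n_items = 16, 8, 5
frozen_item_emb = rng.normal(size=(n_items, llm_dim))

# Trainable projection ("item tower"): adapts open-world text features
# into the collaborative-filtering space.
W_item = rng.normal(size=(llm_dim, cf_dim)) * 0.1

def item_tower(text_emb):
    # Project frozen text embeddings into the CF space.
    return text_emb @ W_item

def user_tower(history_ids):
    # Simplification: user representation as the mean of the projected
    # embeddings of the interaction history (the paper's tower is richer).
    return item_tower(frozen_item_emb[history_ids]).mean(axis=0)

def score(user_vec, item_vecs):
    # Cosine similarity between the user vector and each candidate item.
    u = user_vec / np.linalg.norm(user_vec)
    v = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    return v @ u

user = user_tower([0, 2])
scores = score(user, item_tower(frozen_item_emb))
print(scores.shape)  # one relevance score per candidate item
```

In training, only `W_item` (and the real user tower) would receive gradients under the recommendation objective, while `frozen_item_emb` stays fixed, which is what preserves the LLM's open-world knowledge.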