Pre-train and Fine-tune: Recommenders as Large Models

📅 2025-01-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the dual challenges of dynamic user interest evolution and prohibitively high fine-tuning costs for large language models (LLMs) in recommender systems, this paper proposes a two-stage fine-tuning paradigm grounded in information bottleneck theory: knowledge compression followed by knowledge matching. We introduce Information-Aware Adaptive Kernel (IAK) fine-tuning—the first technique explicitly designed for recommendation—enabling interpretable and generalizable lightweight adaptation. The method balances theoretical rigor with engineering practicality, supporting billion-scale online inference deployment. Evaluated on the homepage of a leading food-delivery platform, it achieves a 2.3% offline AUC gain and statistically significant improvements in online click-through rate (CTR) and gross merchandise volume (GMV). Our framework establishes a reusable, production-ready engineering paradigm for large-scale recommendation fine-tuning.

📝 Abstract
In reality, users' interests differ across periods, regions, scenes, and other contexts. These shifts in interest are so drastic that recommenders struggle to capture them. Existing multi-domain learning can alleviate this problem; however, industrial recommendation systems are structurally complex, their data volumes are huge, and training costs are extremely high, so modifying and re-training an industrial recommender is impractical. To fill this gap, we treat recommenders as large pre-trained models and fine-tune them. We first propose an information-bottleneck theory of fine-tuning and use it to explain the fine-tuning technique in recommenders. To tailor it to recommendation, we design an information-aware adaptive kernel (IAK) technique for fine-tuning the pre-trained recommender. Specifically, we define fine-tuning as two phases, knowledge compression and knowledge matching, and let the training stage of IAK explicitly approximate these two phases. Designed from the essence of fine-tuning, our approach is well interpretable. Extensive online and offline experiments show the superiority of the proposed method. We also share unique and important lessons learned from deploying the method on a large-scale online platform, and we discuss potential issues of fine-tuning techniques in recommendation systems together with their solutions. The recommender with the IAK technique has been deployed on the homepage of a billion-scale online food platform for several months and has yielded considerable profits for our business.
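The abstract frames fine-tuning as two phases, knowledge compression followed by knowledge matching, applied to a frozen pre-trained recommender. The paper's exact IAK architecture is not spelled out here, but one plausible reading is a lightweight residual bottleneck adapter over the frozen backbone's representation. The NumPy sketch below illustrates that reading; all shapes, weight names, and the residual design are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical dimensions: frozen backbone output vs. adapter bottleneck.
D_HID, D_BOTTLENECK = 64, 8
W_compress = rng.normal(size=(D_HID, D_BOTTLENECK)) * 0.1  # phase 1: knowledge compression
W_match = rng.normal(size=(D_BOTTLENECK, D_HID)) * 0.1     # phase 2: knowledge matching

def adaptive_kernel(h):
    """Residual bottleneck adapter (illustrative reading of IAK):
    squeeze the frozen representation, then project back and add."""
    z = relu(h @ W_compress)  # compress to a low-dimensional bottleneck
    return h + z @ W_match    # match the expanded signal to the target task

h = rng.normal(size=(4, D_HID))      # batch of frozen backbone outputs
out = adaptive_kernel(h)             # same shape as input: (4, 64)

# Only the two small adapter matrices would be trained during fine-tuning,
# a tiny fraction of a billion-scale recommender's parameters.
adapter_params = W_compress.size + W_match.size  # 64*8 + 8*64 = 1024
```

The point of the sketch is the cost argument from the abstract: fine-tuning touches only the adapter (here about a thousand parameters), so the industrial recommender itself never needs to be restructured or re-trained.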
Problem

Research questions and friction points this paper is trying to address.

Temporal-Spatial Context
Recommendation Accuracy
Model Fine-tuning Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

IAK
Knowledge Compression and Matching
Adaptive Recommendation Systems
Zhenhao Jiang
The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong, China
Chenghao Chen
Alibaba Group, Hangzhou, Zhejiang, China
Hao Feng
Alibaba Group, Hangzhou, Zhejiang, China
Yu Yang
Alibaba Group, Hangzhou, Zhejiang, China
Jin Liu
Alibaba Group, Hangzhou, Zhejiang, China
Jie Zhang
Alibaba Group, Hangzhou, Zhejiang, China
Jia Jia
Alibaba Group, Hangzhou, Zhejiang, China
Ning Hu
Carnegie Mellon University
Machine Learning, Computer Music, Multimedia, Human-Computer Interaction