Balancing Fine-tuning and RAG: A Hybrid Strategy for Dynamic LLM Recommendation Updates

📅 2025-10-23
🤖 AI Summary
To address the challenge of simultaneously achieving long-term knowledge consolidation and real-time adaptation to evolving user interests and content dynamics in large language model (LLM)-based recommender systems, this paper proposes a hybrid update framework integrating periodic fine-tuning with retrieval-augmented generation (RAG). The approach consolidates domain-specific prior knowledge through scheduled fine-tuning while leveraging RAG for millisecond-level, context-aware inference—dynamically fusing historical user preferences with instantaneous behavioral signals. Its key innovations include a dual-path knowledge updating mechanism and a lightweight online adaptation strategy. Deployed on a billion-user platform, A/B testing demonstrates a 12.7% improvement in user satisfaction and a 43% reduction in model iteration cost, significantly outperforming both standalone fine-tuning and pure RAG baselines.

📝 Abstract
Large Language Models (LLMs) empower recommendation systems through their advanced reasoning and planning capabilities. However, the dynamic nature of user interests and content poses a significant challenge: While initial fine-tuning aligns LLMs with domain knowledge and user preferences, it fails to capture such real-time changes, necessitating robust update mechanisms. This paper investigates strategies for updating LLM-powered recommenders, focusing on the trade-offs between ongoing fine-tuning and Retrieval-Augmented Generation (RAG). Using an LLM-powered user interest exploration system as a case study, we perform a comparative analysis of these methods across dimensions like cost, agility, and knowledge incorporation. We propose a hybrid update strategy that leverages the long-term knowledge adaptation of periodic fine-tuning with the agility of low-cost RAG. We demonstrate through live A/B experiments on a billion-user platform that this hybrid approach yields statistically significant improvements in user satisfaction, offering a practical and cost-effective framework for maintaining high-quality LLM-powered recommender systems.
Problem

Research questions and friction points this paper is trying to address.

Balancing fine-tuning and RAG for dynamic LLM recommendation updates
Addressing real-time user interest changes in LLM-powered recommenders
Developing cost-effective update strategies for recommendation system maintenance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid strategy combining fine-tuning and RAG
Leverages periodic fine-tuning for long-term adaptation
Uses low-cost RAG for real-time agility
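The dual-path idea above can be illustrated with a minimal toy sketch. This is not the paper's implementation: the class, thresholds, and scoring rule are all illustrative assumptions. A periodically "fine-tuned" long-term store consolidates accumulated interactions on a schedule, while a RAG-style buffer of recent behavior is retrieved and fused into scores immediately, so fresh signals influence recommendations before the next consolidation.

```python
from collections import deque


class HybridRecommender:
    """Toy sketch of a dual-path update scheme (all names illustrative):
    a periodically consolidated long-term store stands in for scheduled
    fine-tuning, while a small buffer of recent interactions stands in
    for RAG-style retrieval at inference time."""

    def __init__(self, refresh_every=3, recent_window=5):
        self.long_term = {}                         # item -> consolidated score
        self.recent = deque(maxlen=recent_window)   # retrieval store of fresh signals
        self.pending = []                           # interactions since last consolidation
        self.refresh_every = refresh_every          # stand-in for the fine-tuning cadence

    def log_interaction(self, item):
        self.recent.append(item)                    # instantly visible via retrieval
        self.pending.append(item)
        if len(self.pending) >= self.refresh_every:
            self.fine_tune()                        # scheduled consolidation step

    def fine_tune(self):
        # Fold pending interactions into long-term knowledge, then reset.
        for item in self.pending:
            self.long_term[item] = self.long_term.get(item, 0.0) + 1.0
        self.pending.clear()

    def score(self, item, rag_weight=0.5):
        # Fuse consolidated preferences with retrieved recent behavior.
        retrieved = sum(1.0 for i in self.recent if i == item)
        return self.long_term.get(item, 0.0) + rag_weight * retrieved


rec = HybridRecommender(refresh_every=3)
rec.log_interaction("jazz")
print(rec.score("jazz"))   # recent signal counts even before any consolidation
```

The design point the sketch makes is the one the paper argues: the retrieval path gives agility (a new interaction affects `score` immediately), while the consolidation path keeps long-term knowledge compact, and the two are fused at inference rather than forcing every update through retraining.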
Authors

Changping Meng · Google, Mountain View, California, USA
Hongyi Ling · Texas A&M University
Jianling Wang · Research Scientist, Google
Yifan Liu · Google, Mountain View, California, USA
Shuzhou Zhang · Google, Mountain View, California, USA
Dapeng Hong · Google, Mountain View, California, USA
Mingyan Gao · Google, Mountain View, California, USA
Onkar Dalal · Stanford University
Ed Chi · Google DeepMind, Mountain View, California, USA
Lichan Hong · Google DeepMind
Haokai Lu · Google DeepMind
Ningren Han · Google, Mountain View, California, USA