Self-Evolving Recommendation System: End-To-End Autonomous Model Optimization With LLM Agents

📅 2026-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the optimization of large-scale recommendation systems, where a vast hyperparameter space and slow manual iteration hinder progress. It proposes the first end-to-end autonomous evolution system powered by large language models from the Gemini family, built on a two-layer agent architecture: an offline Inner Loop and an online Outer Loop. The framework automatically designs optimization algorithms, model architectures, and long-term user reward functions. The Inner Loop uses proxy metrics for high-throughput hypothesis generation, while the Outer Loop validates candidates against North Star business metrics in live deployment, closing the loop. Evaluated in YouTube's production environment, the approach led to the deployment of multiple improved models and surpassed traditional human-driven workflows in both model performance and iteration speed, demonstrating for the first time the feasibility and effectiveness of LLM agents as automated machine learning engineers.

📝 Abstract

Optimizing large-scale machine learning systems, such as recommendation models for global video platforms, requires navigating a massive hyperparameter search space and, more critically, designing sophisticated optimizers, architectures, and reward functions to capture nuanced user behaviors. Achieving substantial improvements in these areas is a non-trivial task that has traditionally relied on extensive manual iteration to test new hypotheses. We propose a self-evolving system that leverages Large Language Models (LLMs), specifically those from Google's Gemini family, to autonomously generate, train, and deploy high-performing, complex model changes within an end-to-end automated workflow. The self-evolving system consists of an Offline Agent (Inner Loop) that performs high-throughput hypothesis generation using proxy metrics, and an Online Agent (Outer Loop) that validates candidates against delayed north star business metrics in live production. Our agents act as specialized Machine Learning Engineers (MLEs): they exhibit deep reasoning capabilities, discovering novel improvements in optimization algorithms and model architecture, and formulating innovative reward functions that target long-term user engagement. The effectiveness of this approach is demonstrated through several successful production launches at YouTube, confirming that autonomous, LLM-driven evolution can surpass traditional engineering workflows in both development velocity and model performance.
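
The two-loop structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Candidate`, `inner_loop`, `outer_loop`, `run_demo`) is hypothetical, the LLM call that would propose hypotheses is replaced by a deterministic stub, and both metrics are toy functions chosen only to make the control flow visible.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    """A proposed model change (hypothetical container, not from the paper)."""
    name: str
    proxy_score: float = 0.0
    north_star_delta: float = 0.0


def inner_loop(
    propose: Callable[[int], Candidate],
    proxy_metric: Callable[[Candidate], float],
    n_hypotheses: int,
    top_k: int,
) -> List[Candidate]:
    """Offline stage: generate many hypotheses, score each with a cheap
    proxy metric, and keep the top_k highest-scoring candidates."""
    candidates = [propose(i) for i in range(n_hypotheses)]
    for c in candidates:
        c.proxy_score = proxy_metric(c)
    ranked = sorted(candidates, key=lambda c: c.proxy_score, reverse=True)
    return ranked[:top_k]


def outer_loop(
    shortlist: List[Candidate],
    north_star: Callable[[Candidate], float],
    min_delta: float = 0.0,
) -> List[Candidate]:
    """Online stage: measure each shortlisted candidate against the delayed
    north star metric and keep only those that improve on it."""
    launched = []
    for c in shortlist:
        c.north_star_delta = north_star(c)
        if c.north_star_delta > min_delta:
            launched.append(c)
    return launched


def run_demo() -> Tuple[List[Candidate], List[Candidate]]:
    # Stub standing in for an LLM that proposes a model change (assumption).
    def propose(i: int) -> Candidate:
        return Candidate(name=f"h{i}")

    # Toy proxy metric: later hypotheses score higher (assumption).
    def proxy_metric(c: Candidate) -> float:
        return float(c.name[1:])

    # Toy north star metric: only even-numbered hypotheses help (assumption).
    def north_star(c: Candidate) -> float:
        return 0.1 if int(c.name[1:]) % 2 == 0 else -0.1

    shortlist = inner_loop(propose, proxy_metric, n_hypotheses=10, top_k=3)
    launched = outer_loop(shortlist, north_star)
    return shortlist, launched


if __name__ == "__main__":
    shortlist, launched = run_demo()
    print("shortlist:", [c.name for c in shortlist])
    print("launched:", [c.name for c in launched])
```

The sketch keeps the two stages separate because the abstract assigns them different costs: the proxy metric is cheap enough to score every hypothesis, whereas the north star metric is delayed and measured in production, so only the shortlist reaches it.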

Problem

Research questions and friction points this paper is trying to address.

recommendation system

hyperparameter optimization

model architecture

reward function

large-scale machine learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Evolving System

LLM Agents

Autonomous Model Optimization