AgenticRecTune: Multi-Agent with Self-Evolving Skillhub for Recommendation System Optimization

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenges of configuration optimization in large-scale recommender systems, particularly the difficulties in tuning parameters across multi-stage pipelines, the high cost of manual hyperparameter adjustment, and misaligned objectives between stages. To this end, we propose the first multi-agent collaborative framework specifically designed for recommender system configuration optimization. The framework comprises five agents—Actor, Critic, Insight, Skill, and Online—and leverages the Gemini large language model to enable end-to-end optimization. It further introduces a self-evolving Skillhub mechanism that continuously distills task-specific mechanisms and strategic skills, establishing a closed-loop process for knowledge accumulation and iterative refinement. Experimental results demonstrate that our approach substantially reduces manual tuning effort while achieving balanced performance gains across multiple online metrics.
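The closed loop summarized above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: every function name, the `Skillhub` class, the two head names, and the simulated A/B metric are hypothetical stand-ins, and the LLM-backed agents are replaced by simple deterministic or seeded-random heuristics.

```python
import random
from dataclasses import dataclass, field

Config = dict[str, float]  # hypothetical: model-head name -> fusion weight


@dataclass
class Skillhub:
    """Hypothetical store of distilled skills (plain strings here)."""
    skills: list[str] = field(default_factory=list)


def actor_propose(base: Config, skills: list[str], n: int,
                  rng: random.Random) -> list[Config]:
    # Stand-in for the LLM-backed Actor: perturb the current config n times.
    # A real Actor would condition on `skills`; this toy version ignores them.
    return [{k: max(0.0, v + rng.uniform(-0.2, 0.2)) for k, v in base.items()}
            for _ in range(n)]


def critic_filter(base: Config, candidates: list[Config], keep: int) -> list[Config]:
    # Stand-in Critic: keep the `keep` candidates that deviate least from base.
    def deviation(c: Config) -> float:
        return sum(abs(c[k] - base[k]) for k in base)
    return sorted(candidates, key=deviation)[:keep]


def online_ab_test(config: Config) -> float:
    # Stand-in Online Agent: a simulated metric, peaking at an assumed optimum.
    target = {"click": 0.7, "like": 1.4}
    return -sum((config[k] - target[k]) ** 2 for k in target)


def insight_summarize(round_idx: int, old: float, new: float) -> str:
    # Stand-in Insight Agent: summarize what the round's results showed.
    trend = "improved" if new > old else "did not improve"
    return f"round {round_idx}: best candidate {trend} the metric"


def skill_update(hub: Skillhub, insight: str) -> None:
    # Stand-in Skill Agent: store the insight as a reusable skill.
    hub.skills.append(insight)


def run(rounds: int = 3, seed: int = 0) -> tuple[Config, float, Skillhub]:
    rng = random.Random(seed)
    best: Config = {"click": 1.0, "like": 1.0}
    best_score = online_ab_test(best)
    hub = Skillhub()
    for r in range(rounds):
        candidates = actor_propose(best, hub.skills, n=4, rng=rng)
        shortlisted = critic_filter(best, candidates, keep=2)
        results = [(online_ab_test(c), c) for c in shortlisted]
        top_score, top = max(results, key=lambda t: t[0])
        skill_update(hub, insight_summarize(r, best_score, top_score))
        if top_score > best_score:  # adopt a candidate only if it wins
            best, best_score = top, top_score
    return best, best_score, hub
```

Calling `run()` returns the final configuration, its simulated metric, and the accumulated skills. The point of the sketch is the control flow: propose, filter, test online, then feed the results back into the Skillhub before the next round.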

📝 Abstract

Modern large-scale recommendation systems are typically built as multi-stage pipelines comprising pre-ranking, ranking, and re-ranking phases. Traditional recommendation research usually focuses on optimizing a specific model, such as improving the pre-ranking model structure or the ranking model's training algorithm. System-level configuration optimization, which integrates the outputs of each model head into the final score at each stage, also plays a crucial role. Because of the system's complexity, configuration optimization is both important and challenging. Any model modification requires new optimal system-level configurations, yet each experimental iteration demands significant tuning effort. Furthermore, the models in different stages operate within distinct contexts and optimize for different targets, requiring specialized domain expertise. In addition, successful optimization depends on balancing multiple competing online metrics and aligning with shifting production development objectives. To address these challenges, we propose AgenticRecTune, an agentic framework comprising five specialized agents (Actor, Critic, Insight, Skill, and Online) designed to manage the end-to-end configuration optimization workflow. By leveraging the advanced reasoning of Large Language Models (LLMs), specifically Gemini, AgenticRecTune explores the configuration space for optimal settings. The Actor Agent proposes multiple candidates, and the Critic Agent filters out suboptimal proposals. The Online Agent then autonomously prepares A/B tests based on the configuration set passed on by the Critic Agent and captures the subsequent experimental results. We also introduce a self-evolving Skillhub, in which the Insight Agent and Skill Agent collaborate to summarize historical results, extract the underlying mechanics of each task in the recommendation system, and update skills.
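To make "system-level configuration" concrete, the sketch below shows one way the outputs of several model heads might be combined into a final score. The abstract does not specify the fusion formula, so the weighted sum, the head names (`click`, `like`), and the function names here are all assumptions chosen for illustration.

```python
def fuse(head_scores: dict[str, float], weights: dict[str, float]) -> float:
    # Assumed fusion rule: a weighted sum of per-head predictions.
    return sum(weights[head] * score for head, score in head_scores.items())


def rank(items: dict[str, dict[str, float]], weights: dict[str, float]) -> list[str]:
    # Order item ids by fused score, highest first.
    return sorted(items, key=lambda item_id: fuse(items[item_id], weights), reverse=True)


# Hypothetical per-head predictions for two items.
items = {
    "a": {"click": 0.9, "like": 0.1},
    "b": {"click": 0.4, "like": 0.5},
}

click_heavy = {"click": 1.0, "like": 0.5}
like_heavy = {"click": 0.5, "like": 2.0}

print(rank(items, click_heavy))  # item "a" ranks first
print(rank(items, like_heavy))   # item "b" ranks first
```

The same model outputs produce different rankings under different weights, which is why a model change can invalidate a previously tuned configuration and why the weights are a tuning target in their own right.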

Problem

Research questions and friction points this paper is trying to address.

recommendation system

system-level configuration optimization

multi-stage pipeline

online metrics balancing

domain expertise

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent System

Self-Evolving Skillhub

Large Language Models