🤖 AI Summary
Traditional value model weight tuning in industrial recommendation systems suffers from excessively long optimization cycles (weeks to months), making it incompatible with hour-level iteration requirements. Method: This paper proposes the first end-to-end, feedback-driven automatic tuning framework tailored for recommendation systems that operates on an hourly timescale. It integrates Bayesian optimization, online incremental learning, and lightweight A/B feedback modeling to establish a closed-loop tuning pipeline supporting dynamic business constraints and multi-objective trade-offs. Contribution/Results: Deployed in production, the framework reduces tuning convergence time from months to under 48 hours, improves composite value (CTR and watch time) by 12.7%, and reliably serves over one billion daily requests. This work eases the trade-off between tuning timeliness and accuracy, delivering a scalable, production-ready paradigm for automated value model optimization in industrial recommender systems.
📝 Abstract
Modern recommendation systems can be broadly divided into two key stages: the ranking stage, where the system predicts various user engagements (e.g., click-through rate, like rate, follow rate, watch time), and the value model stage, which aggregates these predictive scores through a function (e.g., a linear combination defined by a weight vector) to express the value of each piece of content as a single numerical score. Both stages play roughly equally important roles in real industrial systems; however, the optimization of the model weights for the second stage still lacks systematic study. This paper focuses on optimizing the second stage through auto-tuning technology. Although general auto-tuning systems and solutions, drawn from both established production practices and open-source projects, can address this problem, they typically require weeks or even months to identify a feasible solution. Such prolonged tuning processes are unacceptable in production environments for recommendation systems, as suboptimal value models can severely degrade user experience. An effective auto-tuning solution must identify a viable model within 2-3 days rather than over the extended timelines typical of existing approaches. In this paper, we introduce a practical auto-tuning system named HyperZero that meets these time constraints while effectively solving the unique challenges inherent in modern recommendation systems. Moreover, this framework has the potential to be extended to broader tuning tasks within recommendation systems.
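To make the value model stage concrete, the following is a minimal sketch of the linear-combination case mentioned in the abstract. It is not HyperZero's implementation: the function names, the weight values, and the candidate predictions are all hypothetical and chosen only for illustration. It shows how a weight vector collapses several ranking-stage predictions into one score, and how changing a single weight can reorder the candidates, which is why the choice of weights matters.

```python
from typing import Dict, List, Tuple

# Hypothetical weight vector. The engagement names follow the abstract;
# the numeric values are made up for illustration.
BASE_WEIGHTS: Dict[str, float] = {
    "click": 1.0,
    "like": 2.0,
    "follow": 4.0,
    "watch_time": 0.5,
}


def value_score(predictions: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear value model: weighted sum of the ranking-stage predictions."""
    return sum(weights[name] * predictions[name] for name in weights)


def rank_by_value(
    candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Score every candidate and sort from highest to lowest value."""
    scored = [(item, value_score(preds, weights)) for item, preds in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical ranking-stage outputs for two pieces of content.
candidates = {
    "video_a": {"click": 0.10, "like": 0.02, "follow": 0.00, "watch_time": 0.40},
    "video_b": {"click": 0.05, "like": 0.05, "follow": 0.01, "watch_time": 0.20},
}

base_ranking = rank_by_value(candidates, BASE_WEIGHTS)

# Raising only the (hypothetical) follow weight flips the order.
tuned_weights = dict(BASE_WEIGHTS, follow=10.0)
tuned_ranking = rank_by_value(candidates, tuned_weights)

print([item for item, _ in base_ranking])
print([item for item, _ in tuned_ranking])
```

With the base weights, video_a ranks first; with the larger follow weight, video_b overtakes it. An auto-tuning system searches over such weight vectors using online feedback instead of hand-picked values.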