🤖 AI Summary
This work addresses the inference-latency challenge of modeling high-dimensional sparse feature interactions in billion-scale live-streaming recommendation systems. To this end, we propose the Zenith architecture, which tokenizes high-dimensional features and introduces two key components, Token Fusion and Token Boost, to efficiently identify and prioritize a small set of critical features (termed Prime Tokens). By enhancing token heterogeneity, Zenith improves model performance while keeping inference cost under control and exhibits more favorable scaling behavior than prior approaches. The approach demonstrates strong empirical gains: after deployment on TikTok Live, it achieves a 1.05% increase in CTR AUC, a 1.10% reduction in Logloss, and lifts of 9.93% and 8.11% in the number and duration, respectively, of high-quality watch sessions per user.
📝 Abstract
Accurately capturing feature interactions is essential in recommender systems, and recent trends show that scaling up model capacity can be a key driver of next-level predictive performance. While prior work has explored various model architectures to capture multi-granularity feature interactions, relatively little attention has been paid to efficient feature handling and to scaling model capacity without incurring excessive inference latency. In this paper, we address this gap by presenting Zenith, a scalable and efficient ranking architecture that learns complex feature interactions with minimal runtime overhead. Zenith concentrates capacity on a few high-dimensional Prime Tokens via its Token Fusion and Token Boost modules and, thanks to its improved token heterogeneity, exhibits superior scaling laws compared with other state-of-the-art ranking methods. Its real-world effectiveness is demonstrated by deploying the architecture on TikTok Live, a leading online livestreaming platform that attracts billions of users globally. Our A/B test shows that Zenith achieves +1.05%/-1.10% in online CTR AUC and Logloss, and realizes gains of +9.93% in Quality Watch Session / User and +8.11% in Quality Watch Duration / User.
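The abstract does not specify how Token Fusion and Token Boost operate internally, but the core idea of keeping a small set of Prime Tokens at full resolution while compressing the remainder can be sketched as below. Everything here is an illustrative assumption, not the paper's actual design: the linear importance scorer, the softmax-weighted pooling used for "fusion", and the sigmoid-based "boost" rule are all placeholders.

```python
import numpy as np

def select_and_fuse_tokens(tokens, scorer_w, k=4, n_fused=2):
    """Hypothetical sketch of Prime-Token handling (not the paper's method).

    tokens:   (n_tokens, d) feature-token embeddings
    scorer_w: (d,) weights of an assumed linear importance scorer
    k:        number of Prime Tokens kept at full resolution
    n_fused:  number of summary tokens produced from the rest
    """
    scores = tokens @ scorer_w                       # per-token importance
    order = np.argsort(-scores)                      # most important first
    prime_idx, rest_idx = order[:k], order[k:]

    # "Token Fusion" (assumed): softmax-weighted pooling of the
    # non-prime tokens into a few fused summary tokens.
    rest = tokens[rest_idx]
    w = np.exp(scores[rest_idx] - scores[rest_idx].max())
    chunks = np.array_split(np.arange(len(rest)), n_fused)
    fused = np.stack([(w[c][:, None] * rest[c]).sum(0) / w[c].sum()
                      for c in chunks])

    # "Token Boost" (assumed): upweight Prime Tokens by 1 + sigmoid(score).
    boost = 1.0 + 1.0 / (1.0 + np.exp(-scores[prime_idx]))
    prime = tokens[prime_idx] * boost[:, None]

    # Downstream interaction layers would now see only k + n_fused tokens,
    # which is how a design like this keeps inference cost bounded.
    return np.concatenate([prime, fused], axis=0)    # (k + n_fused, d)
```

The point of the sketch is the cost argument: attention-style interaction layers scale quadratically in the number of tokens, so reducing a large token set to `k + n_fused` tokens bounds inference latency regardless of how many raw features exist.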