A Unified Knowledge-Distillation and Semi-Supervised Learning Framework to Improve Industrial Ads Delivery Systems

📅 2025-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address label scarcity, train-serving distribution shift, and model overfitting in industrial ad ranking, this paper proposes the first end-to-end unified framework integrating knowledge distillation and semi-supervised learning for trillion-scale ad scenarios. The method jointly incorporates teacher-student distillation, consistency regularization, dynamic pseudo-label optimization, and multi-stage calibration modeling—thereby theoretically characterizing and empirically mitigating calibration bias across multi-stage ranking systems for the first time. Leveraging massive unlabeled impression data, it significantly improves CTR/CVR estimation accuracy and generalization. The framework has been successfully deployed across diverse models, devices, and geographic regions in production systems, serving billions of users with controlled inference overhead.
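The paper's exact objective is not reproduced here, but the named components (teacher-student distillation, consistency regularization, and pseudo-labels from unlabeled impressions) are commonly combined as a weighted sum of losses. The sketch below is a minimal illustration of that pattern; the function names, loss weights, and the choice of MSE for the consistency term are assumptions, not the paper's specification:

```python
import numpy as np

def bce(p, y, eps=1e-7):
    # Binary cross-entropy, the standard loss for CTR/CVR estimation.
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def distill_loss(student_p, teacher_p, eps=1e-7):
    # Teacher soft predictions on unlabeled impressions act as
    # pseudo-labels; the student fits them with cross-entropy.
    return bce(student_p, np.clip(teacher_p, eps, 1 - eps))

def ukdsl_style_loss(student_lab, y_lab,
                     student_unlab, teacher_unlab, student_unlab_aug,
                     alpha=0.5, beta=0.1):
    # Supervised term on labeled impression data.
    sup = bce(student_lab, y_lab)
    # Distillation term: match the teacher's soft CTR on unlabeled data.
    kd = distill_loss(student_unlab, teacher_unlab)
    # Consistency term: predictions should be stable under perturbation
    # of the same unlabeled example (here measured by MSE).
    cons = float(np.mean((student_unlab - student_unlab_aug) ** 2))
    # alpha and beta are hypothetical weights for the unsupervised terms.
    return sup + alpha * kd + beta * cons
```

With identical original and perturbed predictions, the consistency term vanishes and the loss reduces to the supervised term plus the weighted distillation term.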

📝 Abstract
Industrial ads ranking systems conventionally rely on labeled impression data, which leads to challenges such as overfitting, diminishing incremental gains from model scaling, and biases arising from discrepancies between training and serving data. To overcome these issues, we propose a Unified framework for Knowledge-Distillation and Semi-supervised Learning (UKDSL) for ads ranking, enabling models to be trained on significantly larger and more diverse datasets, thereby reducing overfitting and mitigating training-serving data discrepancies. We provide detailed formal analysis and numerical simulations of the inherent miscalibration and prediction bias of multi-stage ranking systems, and show empirical evidence of the proposed framework's ability to mitigate them. Compared to prior work, UKDSL enables models to learn from a much larger set of unlabeled data, improving performance while remaining computationally efficient. Finally, we report the successful deployment of UKDSL in an industrial setting across various ranking models, serving users at multi-billion scale across various surfaces, geographic locations, and clients, and optimizing for various events; to the best of our knowledge, it is the first of its kind in terms of the scale and efficiency at which it operates.
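The miscalibration of multi-stage ranking that the abstract analyzes can be illustrated with a toy simulation (this is a hypothetical example, not the paper's analysis or numbers): a model whose predictions are calibrated over full traffic becomes over-calibrated on the subset a ranking stage passes downstream, because selecting the top items by noisy scores preferentially keeps items whose noise was positive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 100k impressions with true CTRs and a noisy but
# unbiased predictor (noise has zero mean before clipping).
true_ctr = rng.uniform(0.01, 0.2, size=100_000)
pred = np.clip(true_ctr + rng.normal(0.0, 0.03, size=true_ctr.size), 1e-4, 1.0)
clicks = (rng.random(true_ctr.size) < true_ctr).astype(float)

def calibration(p, y):
    # Ratio of predicted to observed positives; 1.0 means calibrated.
    return float(p.sum() / max(y.sum(), 1.0))

# Over full traffic the predictor is roughly calibrated.
full_cal = calibration(pred, clicks)

# A ranking stage keeps only the top 5% by predicted CTR. Selection
# on noisy scores favors positive noise, so downstream predictions
# overestimate outcomes: the calibration ratio drifts above 1.
top = np.argsort(pred)[-5_000:]
selected_cal = calibration(pred[top], clicks[top])
```

The same effect compounds across stages: each stage's survivors are biased toward over-predicted items, which is the selection-induced miscalibration the framework's multi-stage calibration modeling targets.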
Problem

Research questions and friction points this paper is trying to address.

Overcoming overfitting in ads ranking
Mitigating training-serving data discrepancies
Enhancing model performance with unlabeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified Knowledge-Distillation Framework
Semi-supervised Learning for Ads
Scalable Industrial Model Deployment
Hamid Eghbalzadeh (Meta)
Yang Wang (AI at Meta)
Rui Li (AI at Meta)
Yuji Mo (AI at Meta)
Qin Ding (AI at Meta)
Jiaxiang Fu (Meta)
Liang Dai (AI at Meta)
Shuo Gu (AI at Meta)
Nima Noorshams (AI at Meta)
Sem Park (AI at Meta)
Bo Long
Xue Feng (AI at Meta)