LoopCTR: Unlocking the Loop Scaling Power for Click-Through Rate Prediction

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Scaling Transformer-based click-through rate (CTR) prediction models by adding parameters drives up computational and memory costs, which hinders industrial deployment. This work proposes LoopCTR, which introduces a loop scaling paradigm: during training, it increases computation without adding parameters by recursively reusing shared layers, using a sandwich architecture enhanced with Hyper-Connected Residuals and Mixture-of-Experts (MoE), with process supervision at every loop depth; during inference, a single forward pass with no loops already outperforms all baselines. Evaluated on three public benchmarks and one industrial dataset, LoopCTR attains state-of-the-art results. Oracle analysis reveals 0.02–0.04 AUC of untapped headroom, and models trained with fewer loops show higher oracle ceilings, which points to adaptive inference as a promising direction.
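
The train-multi-loop, infer-zero-loop idea can be illustrated with a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the class name LoopedModel, the single dense shared layer, the plain residual update, the shared prediction head, and the averaged per-depth loss are all hypothetical simplifications. The sandwich layout, Hyper-Connected Residuals, and Mixture-of-Experts are omitted because the summary does not specify them.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix (n_out x n_in) stored as nested lists."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def linear(w, x):
    """Matrix-vector product."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class LoopedModel:
    """Hypothetical toy model: one shared layer reused at every loop."""

    def __init__(self, n_features, dim, seed=0):
        rng = random.Random(seed)
        self.entry = make_linear(n_features, dim, rng)  # applied once
        self.shared = make_linear(dim, dim, rng)        # reused on each loop
        self.head = make_linear(dim, 1, rng)            # same head at every depth

    def forward(self, x, n_loops):
        """Return one click probability per loop depth (depth 0 = no loop)."""
        h = [math.tanh(v) for v in linear(self.entry, x)]
        preds = [sigmoid(linear(self.head, h)[0])]
        for _ in range(n_loops):
            update = [math.tanh(v) for v in linear(self.shared, h)]
            h = [a + b for a, b in zip(h, update)]  # plain residual (assumption)
            preds.append(sigmoid(linear(self.head, h)[0]))
        return preds


def process_supervised_loss(preds, label):
    """Average binary cross-entropy over all loop depths (assumed weighting)."""
    eps = 1e-12
    losses = [
        -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))
        for p in preds
    ]
    return sum(losses) / len(losses)


def count_params(model):
    """Parameter count is independent of how many loops are run."""
    return sum(len(row) for w in (model.entry, model.shared, model.head) for row in w)


model = LoopedModel(n_features=4, dim=8)
x = [0.2, -1.0, 0.5, 0.0]

train_preds = model.forward(x, n_loops=3)    # training: supervise every depth
loss = process_supervised_loss(train_preds, label=1)
infer_pred = model.forward(x, n_loops=0)[0]  # inference: single pass, no loop
```

In this sketch, raising n_loops adds computation while count_params(model) stays fixed, and the zero-loop prediction used at inference is the same depth-0 output that was supervised during training.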

📝 Abstract

Scaling Transformer-based click-through rate (CTR) models by stacking more parameters brings growing computational and storage overhead, creating a widening gap between scaling ambitions and stringent industrial deployment constraints. We propose LoopCTR, which introduces a loop scaling paradigm that increases training-time computation through recursive reuse of shared model layers, decoupling computation from parameter growth. LoopCTR adopts a sandwich architecture enhanced with Hyper-Connected Residuals and Mixture-of-Experts, and employs process supervision at every loop depth to encode multi-loop benefits into the shared parameters. This enables a train-multi-loop, infer-zero-loop strategy where a single forward pass without any loop already outperforms all baselines. Experiments on three public benchmarks and one industrial dataset demonstrate state-of-the-art performance. Oracle analysis further reveals 0.02–0.04 AUC of untapped headroom, with models trained with fewer loops exhibiting higher oracle ceilings, pointing to a promising frontier for adaptive inference.
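
One way to read the oracle analysis is sketched below. It assumes the oracle picks, for each sample, the loop depth whose prediction lies closest to the true label, and that headroom is the oracle AUC minus the best single-depth AUC; the abstract does not spell out the definition, so both are assumptions. The function names and the four-sample toy data are hypothetical and do not reproduce the paper's results.

```python
def auc(labels, scores):
    """ROC AUC by pairwise comparison; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def oracle_scores(labels, preds_by_depth):
    """Per sample, keep the prediction of the depth closest to the label."""
    chosen = []
    for i, y in enumerate(labels):
        candidates = [depth_preds[i] for depth_preds in preds_by_depth]
        # Closest to label 1 is the maximum; closest to label 0 is the minimum.
        chosen.append(max(candidates) if y == 1 else min(candidates))
    return chosen


# Toy predictions for four samples at two loop depths (made-up numbers).
labels = [1, 0, 1, 0]
preds_by_depth = [
    [0.9, 0.6, 0.4, 0.2],   # depth 0 (no loop)
    [0.7, 0.3, 0.8, 0.85],  # depth 1
]

per_depth_auc = [auc(labels, p) for p in preds_by_depth]
oracle_auc = auc(labels, oracle_scores(labels, preds_by_depth))
headroom = oracle_auc - max(per_depth_auc)
```

Because the oracle uses the true labels, it gives an upper bound rather than a deployable policy; an adaptive inference method would have to choose the depth without seeing the label.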

Problem

Research questions and friction points this paper is trying to address.

Click-Through Rate Prediction

Model Scaling

Computational Overhead

Industrial Deployment Constraints

Transformer-based Models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Loop Scaling

Parameter Efficiency

Process Supervision