🤖 AI Summary
Traditional knowledge distillation transfers only a final scalar prediction, so it struggles to convey the rich intermediate representations of large models, which limits knowledge transfer efficiency. This work proposes LoopFM, a novel framework that, for the first time, structures historical intermediate embeddings from foundation models as input features for downstream recommendation models. LoopFM establishes a high-bandwidth knowledge transfer channel that requires neither online foundation-model inference nor architectural coupling, complementing conventional distillation approaches. Empirical evaluations demonstrate consistent improvements: on public benchmarks, it achieves AUC gains of over 6% (on TaobaoAd); in industrial deployment, it approximately doubles the knowledge transfer ratio on top of distillation and yields conversion improvements ranging from 0.5% to 1.22%.
📝 Abstract
Knowledge distillation (KD) transfers a single scalar prediction from a large foundation model (FM) to compact vertical models (VMs). It suffers from a diminishing transfer ratio (the fraction of FM improvement captured by the VM), because a single scalar cannot convey the rich intermediate knowledge that larger FMs learn. To address this bottleneck, we propose LoopFM (Learning frOm HistOrical RePresentations of FM), a framework that opens a high-bandwidth transfer channel by structuring FM intermediate embeddings as input features (e.g., user history sequence) for downstream VMs, without requiring real-time FM inference at serving or architectural coupling between FM and VM. We provide a theoretical framework for LoopFM with a gain decomposition and transfer-ratio analysis. On three public benchmarks, LoopFM demonstrates strong AUC improvements (e.g., 6%+ on TaobaoAd) and knowledge transfer that is complementary to KD. On industrial-scale systems (billions of examples, trillion-parameter FMs), LoopFM approximately doubles the knowledge transfer ratio on top of KD, delivering a +0.5% conversion improvement in Y1H1, and +1.03% and +1.22% conversion improvements from two individual launches in Y1H2.
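
To make the transfer ratio concrete, the following is a minimal sketch that reads the abstract's definition (the fraction of FM improvement captured by the VM) as a simple quotient of metric gains. The function name `transfer_ratio`, the choice of a generic quality metric, and all numbers are assumptions for illustration only; they are not taken from the paper, whose formal definition may differ.

```python
def transfer_ratio(fm_base, fm_new, vm_base, vm_new):
    """Fraction of the FM's metric improvement that is captured by the VM.

    Hypothetical helper: assumes both models are scored with the same
    higher-is-better metric before and after the FM is improved.
    """
    fm_gain = fm_new - fm_base
    if fm_gain == 0:
        raise ValueError("FM gain is zero; the transfer ratio is undefined")
    return (vm_new - vm_base) / fm_gain


# Made-up metric values, chosen only to illustrate a doubling of the ratio.
kd_only = transfer_ratio(fm_base=0.700, fm_new=0.720, vm_base=0.690, vm_new=0.694)
kd_plus_loopfm = transfer_ratio(fm_base=0.700, fm_new=0.720, vm_base=0.690, vm_new=0.698)

print(f"KD only:     {kd_only:.2f}")
print(f"KD + LoopFM: {kd_plus_loopfm:.2f}")
```

With these illustrative inputs the VM captures about one fifth of the FM gain under KD alone and about two fifths once the second channel is added, which is the kind of doubling the abstract describes for the industrial setting.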