🤖 AI Summary
To address three critical limitations of DCNv2 in large-scale recommendation systems—(i) information attenuation in deep cross layers, (ii) coarse-grained implicit handling of feature hash collisions, and (iii) absence of explicit second-order feature interaction modeling—this paper proposes DCN^2. The method introduces: (1) an enhanced Cross layer that mitigates gradient and information vanishing in deep cross operations; (2) learnable lookup-level embedding weights for fine-grained suppression of hash collision bias; and (3) an FFM-style layer that explicitly models pairwise similarities to capture discriminative second-order interactions. DCN^2 serves over 0.5 billion predictions per second in online inference. In extensive offline evaluations and production A/B tests, DCN^2 consistently outperforms DCNv2, with gains demonstrated on four public benchmarks—Criteo, Avazu, KuaiRec, and Alibaba-CCP.
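For context, the Cross layer the paper enhances follows the standard DCNv2 recursion x_{l+1} = x_0 ⊙ (W_l x_l + b_l) + x_l. The sketch below shows only this baseline recursion (the paper's specific enhancement against information attenuation is not reproduced); all names and shapes here are illustrative, not from the paper.

```python
import numpy as np

# Baseline DCNv2 cross stack: x_{l+1} = x_0 * (W_l @ x_l + b_l) + x_l.
# The element-wise product with x_0 builds higher-order feature crosses;
# the residual term (+ x_l) is what partially preserves information
# through depth -- the attenuation of this signal is the limitation
# the paper's enhanced Cross layer targets.

rng = np.random.default_rng(42)
dim, depth = 4, 3
W = [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(depth)]
b = [np.zeros(dim) for _ in range(depth)]

def dcnv2_cross(x0: np.ndarray) -> np.ndarray:
    """Apply `depth` cross layers to the input embedding vector x0."""
    x = x0
    for l in range(depth):
        x = x0 * (W[l] @ x + b[l]) + x  # gate by x0, keep residual x_l
    return x

out = dcnv2_cross(rng.normal(size=dim))
```

With all W_l and b_l at zero, each layer reduces to the identity, so the stack degenerates gracefully to a pass-through — one reason the architecture is a robust production baseline.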
📝 Abstract
The Deep and Cross architecture (DCNv2) is a robust production baseline and is integral to numerous real-world recommender systems. Its inherent efficiency and ability to model interactions often result in models that are both simpler and highly competitive compared to more computationally demanding alternatives, such as Deep FFMs. In this work, we introduce three significant algorithmic improvements to the DCNv2 architecture, detailing their formulation and behavior at scale. The enhanced architecture, which we refer to as DCN^2, is actively used in a live recommender system, processing over 0.5 billion predictions per second across diverse use cases, where it outperformed DCNv2 both offline and online (A/B tests). These improvements address key limitations observed in DCNv2: information loss in Cross layers is mitigated, collisions are handled via learnable lookup-level weights instead of implicitly, and pairwise similarities are modeled explicitly with a custom layer that emulates FFMs' behavior. The superior performance of DCN^2 is also demonstrated on four publicly available benchmark data sets.
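The learnable lookup-level weights can be pictured as one trainable scalar per row of the hashed embedding table: rows whose hash bucket is shared by many unrelated raw features can be down-weighted during training, suppressing collision bias. The following is a minimal sketch under that reading; the hashing scheme, initialisation, and all names are our assumptions, not the paper's implementation.

```python
import numpy as np

# Hypothetical lookup-level embedding weights (illustrative, not the
# paper's code). Each row of the hashed embedding table carries a
# learnable scalar, initialised to 1.0; training can shrink it toward
# zero for collision-heavy buckets, giving fine-grained suppression.

rng = np.random.default_rng(0)
num_buckets, dim = 8, 4
table = rng.normal(size=(num_buckets, dim))  # hashed embedding table
lookup_weight = np.ones(num_buckets)         # one scalar per table row

def embed(feature_ids: np.ndarray) -> np.ndarray:
    """Weighted lookup: w[h(f)] * E[h(f)] per feature id (h = modulo hash)."""
    idx = np.mod(feature_ids, num_buckets)
    return lookup_weight[idx, None] * table[idx]

# Suppose bucket 3 suffers heavy collisions; shrinking its weight
# attenuates every lookup that lands there.
lookup_weight[3] = 0.1
vecs = embed(np.array([3, 11]))  # both ids hash to bucket 3
```

Because the weight sits at the lookup level rather than per raw feature, the extra parameter count is just one float per table row, which keeps the mechanism cheap at production scale.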