HIT Model: A Hierarchical Interaction-Enhanced Two-Tower Model for Pre-Ranking Systems

📅 2025-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Decoupled two-tower models used for pre-ranking in online advertising have three limitations: insufficient cross-domain interaction, coarse-grained similarity measurement, and weak modeling of user-ad relationships. To address them, this paper proposes the Hierarchical Interaction-Enhanced Two-Tower (HIT) model, which adds generators and multi-head representers to the two-tower architecture. Two generators pre-model coarse-grained user-ad interactions, and multi-head representers project embeddings into multiple latent subspaces to learn fine-grained user interests and ad attributes. Key innovations include a cosine-similarity-based generation loss, multi-head embedding projection, and hierarchical interaction enhancement. Deployed on Tencent's advertising platform, the method serves billions of impressions daily, achieving a 1.66% lift in GMV and a 1.55% improvement in ROI while keeping serving latency similar to that of the vanilla two-tower baseline.

📝 Abstract

Online display advertising platforms rely on pre-ranking systems to efficiently filter and prioritize candidate ads from large corpora, balancing relevance to users with strict computational constraints. The prevailing two-tower architecture, though highly efficient due to its decoupled design and pre-caching, suffers from limited cross-domain interaction and coarse similarity metrics, which undermine its capacity to model complex user-ad relationships. In this study, we propose the Hierarchical Interaction-Enhanced Two-Tower (HIT) model, a new architecture that augments the two-tower paradigm with two key components: $\textit{generators}$ that pre-generate holistic vectors incorporating coarse-grained user-ad interactions through a dual-generator framework with a cosine-similarity-based generation loss as the training objective, and $\textit{multi-head representers}$ that project embeddings into multiple latent subspaces to capture fine-grained, multi-faceted user interests and multi-dimensional ad attributes. This design enhances modeling effectiveness without compromising inference efficiency. Extensive experiments on public datasets and large-scale online A/B testing on Tencent's advertising platform demonstrate that HIT significantly outperforms several baselines in relevance metrics, yielding a $1.66\%$ increase in Gross Merchandise Volume and a $1.55\%$ improvement in Return on Investment, with serving latency similar to that of vanilla two-tower models. The HIT model has been successfully deployed in Tencent's online display advertising system, serving billions of impressions daily. The code is available at https://anonymous.4open.science/r/HIT_model-5C23.
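
To make the multi-head representer idea in the abstract concrete, here is a minimal NumPy sketch of scoring one user-ad pair across several latent subspaces. It is an illustration under stated assumptions, not the authors' implementation: the function name `multi_head_score`, the projection shapes, the per-head cosine similarity, and the mean aggregation over heads are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`, guarding against zero norm."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def multi_head_score(user_vec, ad_vec, user_proj, ad_proj):
    """Score a user-ad pair in several latent subspaces (hypothetical sketch).

    user_vec, ad_vec: tower outputs of shape (D,)
    user_proj, ad_proj: per-head projection matrices of shape (H, D, K)
    Returns the mean per-head cosine similarity and the per-head values.
    Averaging over heads is an assumption made for this sketch.
    """
    # Project each tower output into H subspaces of size K.
    u_heads = l2_normalize(np.einsum("d,hdk->hk", user_vec, user_proj))
    a_heads = l2_normalize(np.einsum("d,hdk->hk", ad_vec, ad_proj))
    # Cosine similarity inside each subspace, shape (H,).
    per_head = np.sum(u_heads * a_heads, axis=-1)
    return per_head.mean(), per_head


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, H, K = 16, 4, 8  # illustrative sizes only
    user_vec = rng.normal(size=D)
    ad_vec = rng.normal(size=D)
    user_proj = rng.normal(size=(H, D, K))
    ad_proj = rng.normal(size=(H, D, K))
    score, per_head = multi_head_score(user_vec, ad_vec, user_proj, ad_proj)
    print("aggregate score:", score)
    print("per-head similarities:", per_head)
```

Because each side is projected independently, the ad-side heads can still be computed offline and cached, which is the property that keeps two-tower serving cheap.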

Problem

Research questions and friction points this paper is trying to address.

Enhance cross-domain interaction in two-tower pre-ranking systems

Refine coarse similarity metrics to capture complex user-ad relationships

Balance modeling effectiveness with computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Interaction-Enhanced Two-Tower architecture

Dual-generator framework with cosine-similarity loss

Multi-head representers for multi-faceted embeddings
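
As a rough illustration of the cosine-similarity loss listed above, the sketch below computes one minus the mean cosine similarity between a batch of generated vectors and their target vectors. The function name `cosine_generation_loss`, the `1 - cos` form, and the choice of target are assumptions made for this example; the paper's exact objective may differ.

```python
import numpy as np


def cosine_generation_loss(generated, target, eps=1e-12):
    """Mean (1 - cosine similarity) over a batch (hypothetical sketch).

    generated: vectors produced by a generator, shape (B, D)
    target: vectors the generator should approximate, shape (B, D)
    The loss is 0 when every pair points the same way and 2 when opposite.
    """
    dot = np.sum(generated * target, axis=-1)
    norms = np.linalg.norm(generated, axis=-1) * np.linalg.norm(target, axis=-1)
    cos = dot / np.maximum(norms, eps)  # guard against zero-length vectors
    return float(np.mean(1.0 - cos))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    generated = rng.normal(size=(32, 16))  # illustrative batch and width
    target = rng.normal(size=(32, 16))
    print("loss on random pairs:", cosine_generation_loss(generated, target))
    print("loss on identical pairs:", cosine_generation_loss(target, target))
```

A cosine-based objective depends only on direction, not magnitude, which matches the similarity measure typically used when comparing tower outputs.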

🔎 Similar Papers

Hierarchical Structured Neural Network: Efficient Retrieval Scaling for Large Scale Recommendation