InterFormer: Towards Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction

📅 2024-11-15
🏛 arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing CTR models suffer from insufficient cross-modal interaction and information loss due to premature aggregation when fusing heterogeneous user signals, such as static profiles and dynamic behavioral sequences. To address this, we propose InterFormer, a Transformer-based interleaved heterogeneous interaction architecture. It employs bidirectional cross-modal attention for fine-grained feature alignment and introduces a modality-decoupled bridging mechanism that preserves the integrity of each information source while enabling on-demand selective aggregation. This design avoids early fusion, thereby enhancing semantic modeling capability. Extensive experiments demonstrate that InterFormer achieves state-of-the-art performance on three public benchmarks and one large-scale industrial dataset, validating both its effectiveness and generalizability across diverse data regimes.

📝 Abstract
Click-through rate (CTR) prediction, which predicts the probability of a user clicking an ad, is a fundamental task in recommender systems. The emergence of heterogeneous information, such as user profile and behavior sequences, depicts user interests from different aspects. A mutually beneficial integration of heterogeneous information is the cornerstone towards the success of CTR prediction. However, most of the existing methods suffer from two fundamental limitations, including (1) insufficient inter-mode interaction due to the unidirectional information flow between modes, and (2) aggressive information aggregation caused by early summarization, resulting in excessive information loss. To address the above limitations, we propose a novel module named InterFormer to learn heterogeneous information interaction in an interleaving style. To achieve better interaction learning, InterFormer enables bidirectional information flow for mutually beneficial learning across different modes. To avoid aggressive information aggregation, we retain complete information in each data mode and use a separate bridging arch for effective information selection and summarization. Our proposed InterFormer achieves state-of-the-art performance on three public datasets and a large-scale industrial dataset.
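
The two ideas in the abstract, bidirectional information flow and summarization deferred to a separate step, can be illustrated with a toy sketch. This is not the paper's implementation: the function names (`cross_attend`, `interleaved_layer`, `summarize`), the weight-free dot-product attention, the residual updates, and the mean pooling are all assumptions chosen for brevity, and the real InterFormer layers are presumably learned and more elaborate.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(queries, context):
    """Each query attends over all context vectors (scaled dot-product).

    Hypothetical simplification: no learned projections, keys == values.
    """
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in context])
        out.append([sum(w * k[i] for w, k in zip(weights, context))
                    for i in range(len(q))])
    return out

def add(xs, ys):
    """Element-wise residual sum of two equally shaped lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]

def interleaved_layer(profile, sequence):
    """One layer with information flowing in both directions.

    Both updates read the previous layer's state, and both modes keep
    their full set of tokens (nothing is pooled here).
    """
    new_profile = add(profile, cross_attend(profile, sequence))
    new_sequence = add(sequence, cross_attend(sequence, profile))
    return new_profile, new_sequence

def summarize(tokens):
    """Stand-in for a bridging step: mean-pool tokens into one vector."""
    n = len(tokens)
    return [sum(t[i] for t in tokens) / n for i in range(len(tokens[0]))]

# Toy inputs (made up): 2 profile feature vectors, 3 behavior-sequence steps.
profile = [[1.0, 0.0], [0.0, 1.0]]
sequence = [[0.5, 0.5], [1.0, 1.0], [0.0, 2.0]]

for _ in range(2):  # stack two interleaved layers
    profile, sequence = interleaved_layer(profile, sequence)

# Summarization happens only after interaction, not before it.
summary = summarize(profile) + summarize(sequence)
```

After the loop, `profile` still holds two vectors and `sequence` still holds three, so neither mode has been collapsed before the final summarization step; `summary` would then feed whatever prediction head is used.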
Problem

Research questions and friction points this paper is trying to address.

Click-Through Rate Prediction
Information Integration
Accuracy Improvement
Innovation

Methods, ideas, or system contributions that make the work stand out.

InterFormer
Cross-type Information Exchange
CTR Prediction