Deferred is Better: A Framework for Multi-Granularity Deferred Interaction of Heterogeneous Features

📅 2026-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a challenge in CTR prediction: heterogeneous features, such as sparse IDs and dense numerical values, perform suboptimally when forced into a uniform interaction strategy, which often introduces noise and obscures informative signals. To mitigate this, it proposes the Multi-Granularity Information-Aware Deferred Interaction Network (MGDIN), which adaptively controls when feature groups begin interacting across the network's layers through multi-granularity feature grouping and a hierarchical masking mechanism. MGDIN prioritizes high-information features to construct robust initial representations and gradually incorporates sparser features in later layers. This deferred interaction strategy alleviates the adverse effects of feature sparsity, improving both the robustness and predictive accuracy of CTR models while preventing the performance degradation caused by premature interaction of low-information features.

📝 Abstract
Click-through rate (CTR) prediction models estimate the probability of a user-item click by modeling interactions across a vast feature space. A fundamental yet often overlooked challenge is the inherent heterogeneity of these features: their sparsity and information content vary dramatically. For instance, categorical features like item IDs are extremely sparse, whereas numerical features like item price are relatively dense. Prevailing CTR models have largely ignored this heterogeneity, employing a uniform feature interaction strategy that feeds all features into the interaction layers simultaneously. This approach is suboptimal: the premature introduction of low-information features can inject significant noise and mask the signals from information-rich features, which leads to model collapse and hinders the learning of robust representations. To address this challenge, we propose a Multi-Granularity Information-Aware Deferred Interaction Network (MGDIN), which adaptively defers the introduction of features into the feature interaction process. MGDIN's core mechanism operates in two stages. First, it employs a multi-granularity feature grouping strategy to partition the raw features into distinct groups with more homogeneous information density at different granularities, thereby mitigating the effects of extreme individual feature sparsity and enabling the model to capture feature interactions from diverse perspectives. Second, it implements a deferred interaction mechanism through a hierarchical masking strategy, which governs when and how each group participates by masking low-information groups in the early layers and progressively unmasking them as the network deepens. This deferred introduction allows the model to establish a robust understanding based on high-information features before gradually incorporating sparser information from other groups...
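
The hierarchical masking idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the per-group information scores, the linear unmasking schedule, and the helper names `build_layer_masks` and `apply_mask` are all assumptions introduced here for illustration, since the abstract does not specify how groups are scored or scheduled.

```python
from math import ceil

def build_layer_masks(info_scores, num_layers):
    """Return one boolean mask per layer; True means the group participates.

    info_scores: hypothetical per-group information scores (higher = denser).
    Groups are admitted in descending score order, and every group is
    visible at the final layer. The linear schedule is an assumption.
    """
    num_groups = len(info_scores)
    # Rank groups from most to least informative.
    order = sorted(range(num_groups), key=lambda g: info_scores[g], reverse=True)
    masks = []
    for layer in range(num_layers):
        # The number of visible groups grows linearly with depth.
        visible = max(1, ceil(num_groups * (layer + 1) / num_layers))
        mask = [False] * num_groups
        for g in order[:visible]:
            mask[g] = True
        masks.append(mask)
    return masks

def apply_mask(group_embeddings, mask):
    """Zero out the embeddings of masked groups before an interaction layer."""
    return [
        emb if keep else [0.0] * len(emb)
        for emb, keep in zip(group_embeddings, mask)
    ]

# Four hypothetical feature groups, three interaction layers.
scores = [0.9, 0.2, 0.6, 0.1]
masks = build_layer_masks(scores, num_layers=3)
```

With these scores, the first layer sees only the two densest groups, the second layer adds a third, and the last layer sees all four. A real model would more likely apply such a mask inside the interaction operator (for example, as an attention mask) rather than zeroing embeddings, but the scheduling logic would be similar.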
Problem

Research questions and friction points this paper is trying to address.

feature heterogeneity
click-through rate prediction
feature interaction
information sparsity
noisy feature integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

deferred interaction
feature heterogeneity
multi-granularity grouping
hierarchical masking
CTR prediction
Yi Xu
Alibaba Group
Moyu Zhang
Beijing University of Posts and Telecommunications, Alibaba Group
Knowledge Tracing, Information Retrieval, Recommender System
Chaofan Fan
Alibaba Group
Jinxin Hu
Alibaba
Yu Zhang
Alibaba Group
Xiaoyi Zeng
Alibaba Group