AI Summary
Existing multi-task learning (MTL) recommendation models, such as MMoE and PLE, neglect both intra-feature interactions and inter-task variations in feature importance, thereby limiting high-order representation learning. To address this, we propose a framework that jointly models task-specific feature interactions and task sensitivity. Specifically, we design a multi-path task-adaptive feature interaction module to explicitly capture heterogeneous feature importance across tasks, and introduce a feature-importance-aware gating mechanism for dynamic weight allocation. The entire model is trained end-to-end. On a large-scale e-commerce dataset containing over 6.3 billion samples, our method significantly outperforms state-of-the-art MTL baselines including MMoE and PLE. Online A/B testing demonstrates improvements of +3.28% in clicks, +3.10% in order volume, and +2.70% in GMV. Furthermore, extensive experiments on public benchmarks validate its cross-domain generalization capability.
Abstract
Neural-based multi-task learning (MTL) has been successfully applied to many recommendation applications. However, these MTL models (e.g., MMoE, PLE) do not consider feature interaction during optimization, which is crucial for capturing complex high-order features and has been widely used in ranking models for real-world recommender systems. Moreover, through feature importance analysis across various tasks in MTL, we have observed an interesting divergence phenomenon: the same feature can have significantly different importance across different tasks. To address these issues, we propose the Deep Multiple Task-specific Feature Interactions Network (DTN) with a novel model structure design. DTN introduces multiple diversified task-specific feature interaction methods and a task-sensitive network into MTL networks, enabling the model to learn task-specific diversified feature interaction representations, which improves the efficiency of joint representation learning in a general setup. We applied DTN to our company's real-world e-commerce recommendation dataset, consisting of over 6.3 billion samples; the results demonstrate that DTN significantly outperforms state-of-the-art MTL models. Moreover, during online evaluation of DTN in a large-scale e-commerce recommender system, we observed a 3.28% increase in clicks, a 3.10% increase in orders, and a 2.70% increase in GMV (Gross Merchandise Value) compared to the state-of-the-art MTL models. Finally, extensive offline experiments conducted on public benchmark datasets demonstrate that DTN can be applied to various scenarios beyond recommendations, enhancing the performance of ranking models.
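The structure the abstract describes, per-task feature-interaction "experts" whose outputs are combined by a task-sensitive gate, can be sketched in a few lines of numpy. This is a minimal illustrative sketch, not the paper's actual architecture: all names, shapes, and the use of random projections in place of real interaction methods (inner product, cross network, self-attention, etc.) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy dimensions (illustrative assumptions, not the paper's configuration).
batch, n_fields, dim = 4, 6, 8
n_tasks, n_experts = 2, 3               # e.g. a click tower and an order tower

embeddings = rng.normal(size=(batch, n_fields, dim))
flat = embeddings.reshape(batch, -1)    # (batch, n_fields * dim)

# Multiple diversified task-specific interaction "experts": each is a random
# projection standing in for a distinct feature interaction method.
W_expert = rng.normal(size=(n_tasks, n_experts, n_fields * dim, dim))
expert_out = np.einsum('bf,tkfd->btkd', flat, W_expert)  # (batch, tasks, experts, dim)

# Task-sensitive gate: per-task softmax weights over the interaction experts,
# conditioned on the input, so each task re-weights the same features differently.
W_gate = rng.normal(size=(n_tasks, n_fields * dim, n_experts))
gate = softmax(np.einsum('bf,tfk->btk', flat, W_gate))   # (batch, tasks, experts)

# Each task's representation is its own gated mixture of interaction outputs.
task_repr = np.einsum('btk,btkd->btd', gate, expert_out)  # (batch, tasks, dim)
print(task_repr.shape)
```

In a trained model, `task_repr[:, t]` would feed task `t`'s prediction tower; the per-task gates are where the divergent feature importance across tasks is expressed.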