DistDNAS: Search Efficient Feature Interactions within 2 Hours

📅 2023-11-01
🏛️ BigData Congress [Services Society]
📈 Citations: 1
Influential: 0
🤖 AI Summary
Feature interaction design in recommender systems suffers from high search costs and severe redundancy-conflict issues, hindering both model performance and serving efficiency. To address this, we propose a distributed differentiable neural architecture search (DNAS) framework. Our method introduces two key innovations: (1) a cross-date distributed search mechanism that parallelizes architecture exploration across temporal data partitions, and (2) a differentiable cost-aware loss function that jointly optimizes accuracy and computational efficiency. A unified supernet models heterogeneous and multi-order feature interactions, while distributed gradient aggregation and differentiable FLOPs constraints enforce hardware-aware optimization. Evaluated on the 1TB Criteo Terabyte dataset, our approach reduces search cost from 2 days to 2 hours (an over 25× speed-up), cuts inference FLOPs by 60%, and improves AUC by 0.001. The framework achieves a favorable trade-off among search efficiency, model accuracy, and lightweight serving, effectively alleviating the efficiency bottleneck in feature interaction design.
📝 Abstract
Search efficiency and serving efficiency are two major axes in building feature interactions and expediting the model development process in recommender systems. Searching for the optimal feature interaction design on large-scale benchmarks requires extensive cost due to the sequential workflow on the large volume of data. In addition, fusing interactions of various sources, orders, and mathematical operations introduces potential conflicts and additional redundancy toward recommender models, leading to sub-optimal trade-offs in performance and serving cost. This paper presents DistDNAS as a neat solution to brew swift and efficient feature interaction design. DistDNAS proposes a supernet incorporating interaction modules of varying orders and types as a search space. To optimize search efficiency, DistDNAS distributes the search and aggregates the choice of optimal interaction modules on varying data dates, achieving a speed-up of over 25× and reducing the search cost from 2 days to 2 hours. To optimize serving efficiency, DistDNAS introduces a differentiable cost-aware loss to penalize the selection of redundant interaction modules, enhancing the efficiency of discovered feature interactions in serving. We extensively evaluate the best models crafted by DistDNAS on a 1TB Criteo Terabyte dataset. Experimental evaluations demonstrate 0.001 AUC improvement and 60% FLOPs saving over current state-of-the-art CTR models.
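The differentiable cost-aware loss described above can be sketched as an expected-FLOPs penalty under a softmax relaxation over candidate interaction modules. This is a minimal illustration of the idea, not the paper's exact formulation; the logits, FLOPs values, and the scaling factor `lam` are all hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over architecture logits.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cost_aware_penalty(alpha, flops, lam=1e-9):
    # Expected serving FLOPs under the softmax architecture distribution,
    # scaled by lam. Because softmax is differentiable, gradients of this
    # penalty steer the logits away from costly interaction modules.
    w = softmax(alpha)
    return lam * sum(wi * fi for wi, fi in zip(w, flops))

alpha = [2.0, 0.5, -1.0]   # hypothetical architecture logits (learned by the search)
flops = [5e6, 2e7, 8e7]    # hypothetical per-module serving cost in FLOPs
penalty = cost_aware_penalty(alpha, flops)
# Total search objective: task loss (e.g. CTR log-loss) + penalty.
```

Adding the penalty to the task loss lets a single backward pass trade accuracy against serving cost, which is how redundant modules get pruned during the search rather than after it.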
Problem

Research questions and friction points this paper is trying to address.

Efficiently search optimal feature interactions in recommender systems
Reduce conflicts and redundancy in feature interaction fusion
Achieve faster search and lower serving costs simultaneously
Innovation

Methods, ideas, or system contributions that make the work stand out.

Supernet integrates diverse interaction modules efficiently
Distributed search reduces cost from days to hours
Differentiable loss minimizes redundant interaction modules
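The distributed search in the second bullet can be illustrated as follows: each data date runs its own differentiable search in parallel, and the resulting architecture weights are aggregated to pick the final interaction modules. This is a hedged sketch under the assumption of simple averaging plus top-k selection; the paper's actual aggregation rule may differ, and all numbers below are made up.

```python
def aggregate_choices(per_date_alphas, top_k=2):
    # Average architecture logits across per-date searches,
    # then keep the top-k highest-scoring interaction modules.
    n = len(per_date_alphas[0])
    dates = len(per_date_alphas)
    mean = [sum(a[i] for a in per_date_alphas) / dates for i in range(n)]
    ranked = sorted(range(n), key=lambda i: mean[i], reverse=True)
    return sorted(ranked[:top_k])

alphas = [
    [1.2, -0.3, 0.8, 0.1],   # logits from the search on date 1
    [0.9, -0.1, 1.1, 0.0],   # date 2
    [1.0, -0.4, 0.7, 0.2],   # date 3
]
selected = aggregate_choices(alphas, top_k=2)  # indices of modules kept in the final model
```

Because each per-date search touches only its own partition, the wall-clock cost is roughly one date's search plus a cheap aggregation step, which is the source of the days-to-hours speed-up.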
Authors

Tunhou Zhang (Duke University)
W. Wen (Meta AI)
Igor Fedorov (Meta)
Xi Liu (Meta AI)
Buyun Zhang (Meta AI)
Fangqiu Han (Meta AI)
Wen-Yen Chen (Meta)
Yiping Han (Meta AI)
Feng Yan (University of Houston)
Hai Li (Duke University)
Yiran Chen (Duke University)