Robust and Efficient Transfer Learning via Supernet Transfer in Warm-started Neural Architecture Search

📅 2024-07-26
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Neural architecture search (NAS) remains hindered by prohibitive computational costs. To address this, we propose Warm-Start Supernetwork Transfer NAS, a novel framework that synergistically integrates optimal transport theory with multi-dataset joint pretraining to enable efficient cross-task supernetwork parameter reuse. Our method supports zero-shot forward transfer, substantially enhancing the robustness and generalization of differentiable NAS approaches (e.g., DARTS). Evaluated across dozens of image classification benchmarks, it accelerates supernetwork training by 3–5×; searched architectures consistently outperform from-scratch baselines and achieve positive forward transfer on nearly all target datasets. Key contributions are: (1) the first application of optimal transport to supernetwork transfer in NAS; (2) a unified framework co-optimizing multi-dataset pretraining and parameter transfer; and (3) a practical NAS paradigm balancing high efficiency with strong generalization.

πŸ“ Abstract
Hand-designing neural networks is a tedious process that requires significant expertise. Neural Architecture Search (NAS) frameworks offer a very useful and popular solution that helps to democratize AI. However, these NAS frameworks are often computationally expensive to run, which limits their applicability and accessibility. In this paper, we propose a novel transfer learning approach, capable of effectively transferring pretrained supernets based on Optimal Transport or multi-dataset pretraining. This method can be generally applied to NAS methods based on Differentiable Architecture Search (DARTS). Through extensive experiments across dozens of image classification tasks, we demonstrate that transferring pretrained supernets in this way can not only drastically speed up supernet training, which then finds optimal models (3 to 5 times faster on average), but even yield models that outperform those found when running DARTS methods from scratch. We also observe positive transfer to almost all target datasets, making the method very robust. Besides drastically improving the applicability of NAS methods, this also opens up new applications for continual learning and related fields.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational cost of Neural Architecture Search (NAS)
Improving efficiency of supernet training via transfer learning
Enhancing robustness and applicability of NAS methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfer pretrained supernets via Optimal Transport
Speed up supernet training significantly
Applicable to DARTS-based NAS methods
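
The paper itself provides no code here, but the core idea in the bullets above (warm-starting a target supernet by aligning a pretrained supernet's weights via optimal transport) can be sketched at the level of a single layer. The following is a minimal illustration under assumptions, not the authors' implementation: the function names `sinkhorn` and `warm_start_layer`, the entropic (Sinkhorn) regularization, the uniform marginals, and the barycentric-projection step are all choices made for this sketch.

```python
import numpy as np

def sinkhorn(cost, reg=0.5, n_iters=200):
    """Entropic-regularized optimal transport plan between uniform marginals
    (Sinkhorn-Knopp iterations). A common stand-in for exact OT solvers."""
    n, m = cost.shape
    K = np.exp(-cost / reg)                  # Gibbs kernel
    a, b = np.ones(n) / n, np.ones(m) / m    # uniform source/target marginals
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]       # transport plan, sums to 1

def warm_start_layer(w_src, w_tgt):
    """Illustrative OT-based warm start: align a pretrained layer's neurons
    (rows of w_src) with a target layer's neurons (rows of w_tgt)."""
    # Cost: pairwise Euclidean distance between source and target neuron weights.
    cost = np.linalg.norm(w_src[:, None, :] - w_tgt[None, :, :], axis=-1)
    plan = sinkhorn(cost)
    # Barycentric projection: each target neuron is initialized as a
    # plan-weighted average of source neurons, i.e. transferred in the
    # target's neuron ordering rather than copied positionally.
    return (plan.T @ w_src) / plan.sum(axis=0)[:, None]
```

In a real DARTS-style supernet this alignment would be applied per operation and per edge before resuming the bi-level search on the target dataset; the sketch only shows the weight-matching step.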