AI Summary
Existing zero-shot neural architecture search (NAS) proxy metrics suffer from poor generalization, reliance on simplistic statistics, or dependence on ground-truth labels, all of which hinder transferability across search spaces. To address this, we propose the first training-free, label-free, and data-agnostic zero-cost performance prediction framework. Our method combines a Transformer-based operator encoder with a graph convolutional network (GCN) to score arbitrary neural architecture graphs end to end at zero cost. Crucially, it requires no retraining to adapt to novel operators or unseen search spaces. On NAS-Bench-201, it identifies architectures reaching 93.75% CIFAR-10 test accuracy; on the DARTS search space, it discovers models attaining 74.5% ImageNet top-1 accuracy. It also achieves up to 300× higher search efficiency than state-of-the-art zero-cost methods. The framework advances cross-space generalization and practical deployability of zero-shot NAS.
Abstract
Neural architecture search (NAS) is an effective method for discovering new convolutional neural network (CNN) architectures. However, existing approaches often require time-consuming training or intensive sampling and evaluation. Zero-shot NAS aims to create training-free proxies for architecture performance prediction, but existing proxies perform suboptimally and are often outperformed by simple metrics such as parameter count or the number of floating-point operations. Moreover, existing model-based proxies cannot generalize to new search spaces with unseen operator types without ground-truth accuracy labels. A universally optimal proxy remains elusive. We introduce TG-NAS, a novel model-based universal proxy that leverages a transformer-based operator embedding generator and a graph convolutional network (GCN) to predict architecture performance. This approach guides neural architecture search across any given search space without the need for retraining. Unlike other model-based predictor subroutines, TG-NAS itself acts as a zero-cost (ZC) proxy, guiding architecture search with advantages in data independence, cost-effectiveness, and consistency across diverse search spaces. Our experiments demonstrate its advantages over existing proxies across various NAS benchmarks, suggesting its potential as a foundational element for efficient architecture search. TG-NAS achieves up to 300X improvement in search efficiency compared to previous state-of-the-art (SOTA) ZC proxy methods. Notably, it discovers competitive models with 93.75% CIFAR-10 accuracy on the NAS-Bench-201 space and 74.5% ImageNet top-1 accuracy on the DARTS space.
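To make the pipeline described above concrete, the following is a minimal sketch of how operator embeddings and a single graph convolution could be combined into a scalar score for ranking candidate architectures. It is not the TG-NAS implementation. Every name here (`toy_operator_embedding`, `score_architecture`, the candidate cells, the layer sizes) is hypothetical, the embedding is a seeded random vector standing in for a transformer-generated one, the weights are random placeholders rather than trained parameters, and the directed cell graph is treated as undirected for simplicity.

```python
import math
import random


def toy_operator_embedding(op_name, dim=8):
    """Hypothetical stand-in for a transformer-based operator embedding.

    Seeding by name means the same operator always maps to the same vector.
    """
    rng = random.Random(op_name)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2}, treating edges as undirected (an assumption)."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for src, dst in edges:
        a[src][dst] = 1.0
        a[dst][src] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def score_architecture(ops, edges, w1, w_out):
    """Embed operators, apply one GCN layer, mean-pool nodes, project to a scalar."""
    h = [toy_operator_embedding(op, dim=len(w1)) for op in ops]
    a_hat = normalized_adjacency(edges, len(ops))
    h = gcn_layer(a_hat, h, w1)
    pooled = [sum(col) / len(h) for col in zip(*h)]
    return sum(p * w for p, w in zip(pooled, w_out))


# Placeholder weights; a real predictor would load trained parameters instead.
_rng = random.Random(0)
DIM, HIDDEN = 8, 4
W1 = [[_rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(DIM)]
W_OUT = [_rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]

# Two hypothetical cells: (operator label per node, list of edges).
candidates = {
    "cell_a": (["input", "conv3x3", "skip_connect", "output"],
               [(0, 1), (0, 2), (1, 3), (2, 3)]),
    "cell_b": (["input", "conv1x1", "avg_pool3x3", "output"],
               [(0, 1), (1, 2), (2, 3)]),
}

# Score every candidate without any training and keep the highest-scoring one.
scores = {name: score_architecture(ops, edges, W1, W_OUT)
          for name, (ops, edges) in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Because the score depends only on operator labels and graph connectivity, no dataset or labels are touched during search, which is the property the abstract describes as data independence. An operator the predictor has never seen still receives an embedding from its name, which illustrates how a predictor of this shape could be applied to a new search space without retraining.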