🤖 AI Summary
Neural architecture search (NAS) faces significant challenges in highly expressive search spaces, such as those based on context-free grammars, including prohibitive evaluation costs and poor cross-dataset generalization. To address this, we propose a transferable surrogate modeling framework that leverages either zero-cost proxies combined with neural graph features (GRAF) or fine-tuned large language models to enable accurate cross-dataset prediction of architecture performance. Our work is the first to systematically demonstrate strong generalization of such surrogate models in cross-domain settings, supporting both architecture pre-screening and direct substitution of the expensive training objective. The method substantially reduces NAS computational overhead, discovers superior architectures on unseen datasets, and achieves high prediction accuracy with robust transferability, effectively balancing search efficiency and final model performance.
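As a rough sketch of the tabular branch of this setup, the snippet below trains a simple regressor on concatenated zero-cost-proxy (ZCP) and GRAF feature vectors and checks rank correlation on held-out architectures. The random-forest choice, the feature dimensions, and the synthetic data are illustrative stand-ins and not the paper's actual configuration.

```python
# Hedged sketch (assumed setup): fit a tabular surrogate on ZCP + GRAF features
# to predict validation accuracy, then score it by rank correlation, the usual
# metric for performance predictors.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from scipy.stats import kendalltau

rng = np.random.default_rng(0)

# Stand-in data: one row per architecture, columns = concatenated ZCP + GRAF
# feature values (hypothetical dimensions).
n_archs, n_feats = 1000, 32
X = rng.normal(size=(n_archs, n_feats))    # placeholder ZCP + GRAF features
y = rng.uniform(0.5, 0.95, size=n_archs)   # placeholder validation accuracies

# Train on architectures from one dataset ...
surrogate = RandomForestRegressor(n_estimators=300, random_state=0)
surrogate.fit(X[:800], y[:800])

# ... and evaluate ranking quality on held-out (or, in the transfer setting,
# cross-dataset) architectures.
tau, _ = kendalltau(surrogate.predict(X[800:]), y[800:])
print(f"Kendall tau on held-out architectures: {tau:.3f}")
```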
📝 Abstract
Neural architecture search (NAS) faces a challenge in balancing the exploration of expressive, broad search spaces that enable architectural innovation against the need for efficient evaluation of architectures in order to search such spaces effectively. We investigate surrogate model training for improving search in highly expressive NAS search spaces based on context-free grammars. We show that i) surrogate models trained either on zero-cost-proxy metrics and neural graph features (GRAF) or by fine-tuning an off-the-shelf language model (LM) have high predictive power for the performance of architectures both within and across datasets, ii) these surrogates can be used to filter out bad architectures when searching on novel datasets, thereby significantly speeding up search and achieving better final performance, and iii) the surrogates can be further used directly as the search objective for huge speed-ups.
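To make the two usage modes in ii) and iii) concrete, here is a hedged sketch of a single search step: pre-screening keeps only the surrogate's top-ranked candidates for full training, while objective mode skips training entirely. The helpers `sample_architecture`, `featurize`, and `train_and_evaluate` are hypothetical placeholders for the grammar-based sampler, ZCP/GRAF feature extraction, and full training, respectively.

```python
# Illustrative sketch of the two surrogate usage modes; not the paper's exact
# search algorithm.
def search(surrogate, sample_architecture, featurize, train_and_evaluate,
           n_candidates=500, n_trained=20, surrogate_as_objective=False):
    # Sample candidates from the expressive (e.g. grammar-based) search space
    # and rank them by predicted performance.
    candidates = [sample_architecture() for _ in range(n_candidates)]
    scores = surrogate.predict([featurize(a) for a in candidates])
    ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)

    if surrogate_as_objective:
        # Mode iii): the surrogate score *is* the objective; no training at all.
        return ranked[0][1]

    # Mode ii): pay full training cost only for the surrogate's short list.
    best_score, best_arch = max(
        ((train_and_evaluate(arch), arch) for _, arch in ranked[:n_trained]),
        key=lambda p: p[0],
    )
    return best_arch
```

In pre-screening mode the training budget is spent on only `n_trained` of the `n_candidates` sampled architectures, which is where the speed-up comes from; objective mode removes training from the loop entirely at the cost of relying fully on the surrogate's ranking.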