🤖 AI Summary
To address the weak generalization capability and heavy reliance on performance evaluations in neural architecture search (NAS), this paper proposes a hypernetwork-based architecture representation learning framework. The method introduces three key innovations: (1) a global encoding scheme that jointly models topological structure and operation semantics across heterogeneous architectures; (2) a shared hypernetwork architecture that explicitly captures inter-architecture relationships to enhance transferability; and (3) a dynamic adaptive multi-task loss that jointly optimizes prediction accuracy and training stability. Trained efficiently with only a small number of proxy dataset samples, the approach achieves significant improvements in both predictive accuracy and search efficiency across five mainstream NAS search spaces. It attains 97.60% top-1 accuracy on CIFAR-10 and 82.4% on ImageNet, establishing new state-of-the-art performance with less than one-fifth the evaluation budget required by conventional methods.
📝 Abstract
Time-intensive performance evaluations significantly impede progress in Neural Architecture Search (NAS). To address this, neural predictors leverage surrogate models trained on proxy datasets, allowing for direct performance predictions for new architectures. However, these predictors often exhibit poor generalization due to their limited ability to capture intricate relationships among various architectures. In this paper, we propose HyperNAS, a novel neural predictor paradigm for enhancing architecture representation learning. HyperNAS consists of two primary components: a global encoding scheme and a shared hypernetwork. The global encoding scheme is devised to capture comprehensive macro-structure information, while the shared hypernetwork serves as an auxiliary task to enhance the investigation of inter-architecture patterns. To ensure training stability, we further develop a dynamic adaptive multi-task loss to facilitate personalized exploration on the Pareto front. Extensive experiments across five representative search spaces, including ViTs, demonstrate the advantages of HyperNAS, particularly in few-shot scenarios. For instance, HyperNAS achieves new state-of-the-art results, with 97.60% top-1 accuracy on CIFAR-10 and 82.4% top-1 accuracy on ImageNet, using at least 5.0$\times$ fewer samples.
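The abstract names a "dynamic adaptive multi-task loss" that balances the prediction objective against the auxiliary hypernetwork objective, but does not specify its form. As a minimal sketch (not the paper's actual formulation), one common way to adapt task weights dynamically is to weight each task by how slowly its loss has been decreasing, so that neither the prediction task nor the auxiliary task dominates training; the function names below are illustrative, not from the paper:

```python
import math

def dynamic_weights(prev_losses, curr_losses, temperature=2.0):
    """Compute per-task weights from recent loss-descent ratios.

    A task whose loss is decreasing more slowly (ratio closer to 1)
    receives a larger weight, rebalancing optimization toward it.
    Weights are normalized so they sum to the number of tasks.
    """
    ratios = [c / max(p, 1e-12) for p, c in zip(prev_losses, curr_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    n = len(ratios)
    return [n * e / total for e in exps]

def total_loss(losses, weights):
    """Weighted sum of the per-task losses for one training step."""
    return sum(w * l for w, l in zip(weights, losses))

# Example: task 0 (prediction) improved faster than task 1 (auxiliary),
# so task 1 is up-weighted on the next step.
w = dynamic_weights(prev_losses=[1.0, 1.0], curr_losses=[0.5, 0.9])
```

Here `w[1] > w[0]`, steering the next update toward the slower-improving auxiliary task; the paper's actual scheme may differ, e.g. in how it explores the Pareto front between the two objectives.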