Dextr: Zero-Shot Neural Architecture Search with Singular Value Decomposition and Extrinsic Curvature

📅 2025-08-18
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing zero-shot neural architecture search (NAS) methods rely on labeled data and struggle to jointly optimize architectural convergence, generalization, and representational capacity. To address this, we propose a label-free, zero-cost proxy metric that unifies two intrinsic architectural properties: (i) layer-wise feature condition numbers, derived via singular value decomposition, which jointly characterize convergence behavior and representational expressivity; and (ii) the extrinsic curvature of network outputs, which quantifies nonlinear generalization capability. Critically, our metric requires only a single unlabeled input sample for accurate performance prediction, eliminating dependence on ground-truth labels entirely. Evaluated across DARTS, AutoFormer, and multiple NAS benchmarks, it achieves significantly improved proxy correlation (averaging +12.3%) while substantially enhancing both search efficiency and top-1 accuracy. This work establishes a practical, high-efficiency paradigm for zero-shot NAS in real-world settings.
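The layer-wise condition number mentioned above can be sketched as follows. This is a minimal, illustrative computation, not the paper's pipeline: the feature matrix here is a random placeholder standing in for activations gathered from a forward pass on a single unlabelled input. A large ratio between the extreme singular values signals strongly collinear channels.

```python
import numpy as np

# Placeholder layer activations, flattened to (samples x channels); in the
# paper's setting these would come from one unlabelled input sample.
features = np.random.default_rng(1).standard_normal((128, 16))

# Condition number via SVD: ratio of the largest to the smallest singular
# value. Nearly collinear channels push the smallest singular value toward
# zero, inflating the condition number.
singular_values = np.linalg.svd(features, compute_uv=False)
kappa = singular_values[0] / singular_values[-1]
```

NumPy returns singular values in descending order, so the first and last entries are the extremes.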

📝 Abstract
Zero-shot Neural Architecture Search (NAS) typically optimises the architecture search process by exploiting the network or gradient properties at initialisation through zero-cost proxies. The existing proxies often rely on labelled data, which is usually unavailable in real-world settings. Furthermore, the majority of the current methods focus either on optimising the convergence and generalisation attributes or solely on the expressivity of the network architectures. To address both limitations, we first demonstrate how channel collinearity affects the convergence and generalisation properties of a neural network. Then, by incorporating convergence, generalisation and expressivity in one approach, we propose a zero-cost proxy that omits the requirement of labelled data for its computation. In particular, we leverage the Singular Value Decomposition (SVD) of the neural network layer features and the extrinsic curvature of the network output to design our proxy. As a result, the proposed proxy is formulated as the simplified harmonic mean of the logarithms of two key components: the sum of the inverse of the feature condition number and the extrinsic curvature of the network output. Our approach enables accurate prediction of network performance on test data using only a single label-free data sample. Our extensive evaluation includes a total of six experiments, covering the Convolutional Neural Network (CNN) search space, i.e., DARTS, and the Transformer search space, i.e., AutoFormer. The proposed proxy demonstrates superior performance on multiple correlation benchmarks, including NAS-Bench-101, NAS-Bench-201, and TransNAS-Bench-101-micro, as well as on the NAS task within the DARTS and AutoFormer search spaces, all while being notably efficient. The code is available at https://github.com/rohanasthana/Dextr.
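Reading the abstract's description of the proxy literally, a toy sketch might look like the following. The harmonic-mean combination, the random feature matrices, and the fixed curvature value are our illustrative interpretation under stated assumptions, not the authors' reference implementation (see the linked repository for that):

```python
import numpy as np

def condition_number(features: np.ndarray) -> float:
    """Ratio of largest to smallest singular value of a layer's feature matrix."""
    s = np.linalg.svd(features, compute_uv=False)
    return s[0] / max(s[-1], 1e-12)  # guard against rank-deficient features

def dextr_like_proxy(layer_features, curvature):
    """Illustrative combination: simplified harmonic mean of the logarithms of
    (i) the sum of inverse feature condition numbers and (ii) the extrinsic
    curvature of the network output, as the abstract describes."""
    inv_kappa_sum = sum(1.0 / condition_number(f) for f in layer_features)
    a, b = np.log(inv_kappa_sum), np.log(curvature)
    return 2 * a * b / (a + b)  # harmonic mean of the two logarithms

rng = np.random.default_rng(0)
feats = [rng.standard_normal((64, 32)) for _ in range(4)]  # placeholder activations
score = dextr_like_proxy(feats, curvature=3.5)  # curvature value is a stand-in
```

A single forward pass on one unlabelled sample would supply both ingredients, which is what makes the proxy label-free and cheap to evaluate across a search space.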
Problem

Research questions and friction points this paper is trying to address.

Eliminates need for labeled data in zero-shot NAS
Combines convergence, generalization, and expressivity in one proxy
Uses SVD and extrinsic curvature for architecture performance prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses SVD for neural network feature analysis
Incorporates extrinsic curvature of network output
Computes the zero-cost proxy without any labelled data
Rohan Asthana
Friedrich-Alexander-Universität Erlangen-Nürnberg
Joschua Conrad
Universität Ulm
Maurits Ortmanns
Universität Ulm
Vasileios Belagiannis
Professor, Friedrich-Alexander-Universität Erlangen-Nürnberg
Machine Learning · Computer Vision · Robotics