CR-LSO: Convex Neural Architecture Optimization in the Latent Space of Graph Variational Autoencoder with Input Convex Neural Networks

📅 2022-11-11
🏛️ arXiv.org
📈 Citations: 3
Influential: 1
🤖 AI Summary
To address the instability of gradient-based optimization in latent space optimization (LSO) for neural architecture search (NAS), which stems from the non-convexity of the architecture-performance mapping, this paper proposes convexity-regularized latent space optimization (CR-LSO). CR-LSO models architectural topologies with a graph variational autoencoder (G-VAE) and, in a first for NAS, introduces input convex neural networks (ICNNs) to enforce convexity on the latent-to-performance mapping. By jointly learning a convexity-regularized latent space and a differentiable ICNN surrogate, CR-LSO stabilizes gradient-based search in the latent space. Evaluated on NAS-Bench-101, NAS-Bench-201, and NAS-Bench-301, CR-LSO achieves competitive search performance at low computational cost, improving both optimization stability and the quality of discovered architectures.
📝 Abstract
In neural architecture search (NAS) methods based on latent space optimization (LSO), a deep generative model is trained to embed discrete neural architectures into a continuous latent space. Different optimization algorithms that operate in the continuous space can then be applied to search for neural architectures. However, optimizing latent variables is challenging for gradient-based LSO, since the mapping from the latent space to architecture performance is generally non-convex. To tackle this problem, this paper develops a convexity-regularized latent space optimization (CR-LSO) method, which regularizes the learning of the latent space so as to obtain a convex architecture-performance mapping. Specifically, CR-LSO trains a graph variational autoencoder (G-VAE) to learn continuous representations of discrete architectures. Simultaneously, the learning of the latent space is regularized by the guaranteed convexity of input convex neural networks (ICNNs). In this way, the G-VAE is forced to learn a convex mapping from architecture representations to architecture performance. Afterwards, CR-LSO approximates the performance mapping with the ICNN and leverages the estimated gradient to optimize neural architecture representations. Experimental results on three popular NAS benchmarks show that CR-LSO achieves competitive results in terms of both computational complexity and architecture performance.
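The convexity guarantee mentioned in the abstract comes from the ICNN construction: hidden-to-hidden weights are constrained to be non-negative, activations are convex and non-decreasing, and unconstrained skip connections from the input stay affine. A minimal NumPy sketch of this idea follows; the layer sizes and the single hidden layer are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # ReLU is convex and non-decreasing, which preserves convexity
    # under non-negative compositions.
    return np.maximum(x, 0.0)

class ICNN:
    """Minimal input convex neural network: f(z) is convex in z because
    the hidden-to-hidden weights (Wz, wz_out) are non-negative and the
    activations are convex non-decreasing, while skip connections from
    the input (Wx, wx_out) are unconstrained affine maps (affine
    functions are convex). Sizes are illustrative, not from the paper."""

    def __init__(self, in_dim, hidden=32):
        self.W0 = rng.standard_normal((hidden, in_dim))
        self.b0 = rng.standard_normal(hidden)
        self.Wz = np.abs(rng.standard_normal((hidden, hidden)))   # constrained >= 0
        self.Wx = rng.standard_normal((hidden, in_dim))           # unconstrained skip
        self.wz_out = np.abs(rng.standard_normal(hidden))         # constrained >= 0
        self.wx_out = rng.standard_normal(in_dim)                 # unconstrained skip

    def __call__(self, z):
        h = relu(z @ self.W0.T + self.b0)
        h = relu(h @ self.Wz.T + z @ self.Wx.T)
        return h @ self.wz_out + z @ self.wx_out

# Convexity check: f((a+b)/2) <= (f(a)+f(b))/2 holds by construction.
net = ICNN(8)
a = rng.standard_normal((16, 8))
b = rng.standard_normal((16, 8))
midpoint_gap = (net(a) + net(b)) / 2 - net((a + b) / 2)
```

Because convexity holds for any non-negative weight values, it survives training as long as the constrained weights are clamped (or re-parameterized) after each update; that is what lets the ICNN regularize the latent space throughout G-VAE training.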
Problem

Research questions and friction points this paper is trying to address.

Optimizing neural architectures in non-convex latent spaces
Regularizing latent space learning for convex performance mapping
Improving gradient-based architecture search via convexity regularization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Graph Variational Autoencoder for architecture representation
Regularizes latent space with Input Convex Neural Networks
Optimizes neural architectures via convex performance mapping
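The last bullet, gradient-based optimization against a convex performance mapping, can be sketched schematically. The snippet below uses a hand-written quadratic as a stand-in for the trained ICNN surrogate (in CR-LSO the gradient would come from the ICNN, and the result would be decoded back to a discrete architecture by the G-VAE); `surrogate`, `z_star`, the step size, and the step count are all illustrative assumptions:

```python
import numpy as np

def surrogate(z, z_star):
    """Stand-in for the trained ICNN performance predictor: concave in z
    (i.e., its negation is convex), maximized at the hypothetical
    optimum z_star."""
    return -np.sum((z - z_star) ** 2)

def surrogate_grad(z, z_star):
    # Analytic gradient of the quadratic stand-in; CR-LSO would
    # differentiate the ICNN instead.
    return -2.0 * (z - z_star)

def latent_search(z0, z_star, lr=0.1, steps=100):
    """Gradient ascent on predicted performance in the latent space."""
    z = z0.copy()
    for _ in range(steps):
        z = z + lr * surrogate_grad(z, z_star)
    return z

# Starting from the latent code of a seed architecture (the origin here),
# ascend toward the surrogate's optimum.
z_found = latent_search(np.zeros(8), z_star=np.ones(8))
```

With a convex (here, concave-for-maximization) surrogate, this ascent converges to the surrogate's optimum regardless of the starting code, which is exactly the stability benefit the convexity regularization is meant to buy.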