Weight-Entanglement Meets Gradient-Based Neural Architecture Search

📅 2023-12-16
🏛️ AutoML
📈 Citations: 5
Influential: 0
🤖 AI Summary
This work addresses the challenge of inefficient optimization in gradient-based neural architecture search (NAS) within weight-entangled search spaces. We propose the first differentiable NAS framework supporting macro-architecture-level weight sharing. Methodologically, we design a novel gradient optimization mechanism that tightly integrates continuous architecture relaxation, hypernetwork parameter sharing, and weight entanglement—enabling end-to-end differentiability while preserving memory efficiency. Our key contribution is the first successful adaptation of gradient-based NAS to highly coupled weight-entangled spaces, overcoming the representational limitations of conventional weight-sharing NAS approaches. Evaluated on ImageNet, our method achieves state-of-the-art accuracy-efficiency trade-offs: 42% reduction in GPU memory consumption and 3.1× acceleration in search speed. The implementation is publicly available.
📝 Abstract
Weight sharing is a fundamental concept in neural architecture search (NAS), enabling gradient-based methods to explore cell-based architectural spaces significantly faster than traditional black-box approaches. In parallel, weight-entanglement has emerged as a technique for more intricate parameter sharing amongst macro-architectural spaces. Since weight-entanglement is not directly compatible with gradient-based NAS methods, these two paradigms have largely developed independently in parallel sub-communities. This paper aims to bridge the gap between these sub-communities by proposing a novel scheme to adapt gradient-based methods for weight-entangled spaces. This enables us to conduct an in-depth comparative assessment and analysis of the performance of gradient-based NAS in weight-entangled search spaces. Our findings reveal that this integration of weight-entanglement and gradient-based NAS brings forth the various benefits of gradient-based methods, while preserving the memory efficiency of weight-entangled spaces. The code for our work is openly accessible at https://github.com/automl/TangleNAS.
Problem

Research questions and friction points this paper is trying to address.

Bridging weight-entanglement with gradient-based NAS methods
Enabling comparative analysis in weight-entangled search spaces
Integrating memory efficiency with gradient-based optimization benefits
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts gradient-based NAS for weight-entangled spaces
Enables comparative analysis of performance in entangled spaces
Combines gradient benefits with memory efficiency of entanglement
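The core mechanism the bullets describe can be sketched roughly as follows: a single "entangled" superweight tensor from which each candidate sub-layer is a slice, combined with a DARTS-style continuous relaxation (softmax over architecture parameters) so the discrete choice becomes differentiable. This is a minimal NumPy illustration under assumed dimensions and names, not the paper's actual TangleNAS implementation:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

class EntangledLinear:
    """One shared weight matrix; each smaller candidate layer is a slice of it."""
    def __init__(self, candidate_dims, rng=None):
        rng = rng or np.random.default_rng(0)
        self.dims = candidate_dims                    # e.g. [16, 32, 64] (hypothetical)
        d_max = max(candidate_dims)
        self.W = rng.standard_normal((d_max, d_max))  # entangled superweight
        self.alpha = np.zeros(len(candidate_dims))    # learnable architecture parameters

    def forward(self, x):
        # Continuous relaxation: mix the outputs of all weight slices,
        # weighted by a softmax over the architecture parameters. Gradients
        # thus flow both to alpha and to the shared superweight W.
        mix = softmax(self.alpha)
        out = np.zeros(max(self.dims))
        for w, d in zip(mix, self.dims):
            y = self.W[:d, :d] @ x[:d]  # slice shares parameters with larger choices
            out[:d] += w * y            # smaller outputs are zero-padded into out
        return out

layer = EntangledLinear([16, 32, 64])
y = layer.forward(np.ones(64))
```

Because every candidate reuses slices of the same tensor, memory stays at the cost of the largest choice alone, while the softmax mixture restores the end-to-end differentiability that plain weight-entangled supernets lack.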