🤖 AI Summary
Gradient-based one-shot neural architecture search (NAS) faces two key issues: (1) evaluations rely heavily on the DARTS benchmark, which has become saturated, so reported improvements often fall within the margin of noise; and (2) implementations are fragmented across separate repositories, which impedes fair comparison and reproducibility. To address these, we introduce Configurable Optimizer (Confopt), an extensible open-source library for developing and evaluating gradient-based one-shot NAS methods. Its main contributions are: (1) a minimal API that makes it easy to integrate new search spaces; (2) support for decomposing NAS optimizers into their core components; and (3) a suite of new DARTS-based benchmarks which, combined with a novel evaluation protocol, reveals a critical flaw in how gradient-based one-shot NAS methods are currently assessed. Confopt is intended as a common basis for reproducible NAS research and fair method comparison.
📝 Abstract
Gradient-based one-shot neural architecture search (NAS) has significantly reduced the cost of exploring architectural spaces with discrete design choices, such as selecting operations within a model. However, the field faces two major challenges. First, evaluations of gradient-based NAS methods heavily rely on the DARTS benchmark, despite the existence of other available benchmarks. This overreliance has led to saturation, with reported improvements often falling within the margin of noise. Second, implementations of gradient-based one-shot NAS methods are fragmented across disparate repositories, complicating fair and reproducible comparisons and further development. In this paper, we introduce Configurable Optimizer (confopt), an extensible library designed to streamline the development and evaluation of gradient-based one-shot NAS methods. Confopt provides a minimal API that makes it easy for users to integrate new search spaces, while also supporting the decomposition of NAS optimizers into their core components. We use this framework to create a suite of new DARTS-based benchmarks, and combine them with a novel evaluation protocol to reveal a critical flaw in how gradient-based one-shot NAS methods are currently assessed. The code can be found at https://github.com/automl/ConfigurableOptimizer.
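For readers unfamiliar with the setting, the following is a minimal sketch of the continuous relaxation that DARTS-style gradient-based one-shot NAS applies to a discrete choice among operations. It is not Confopt's API: the names `CANDIDATE_OPS`, `mixed_op`, and `discretize`, and the toy scalar operations, are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List

# Hypothetical candidate operations on a scalar input
# (stand-ins for convolution, pooling, skip connection, ...).
CANDIDATE_OPS: Dict[str, Callable[[float], float]] = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x: float, alphas: List[float]) -> float:
    """Continuous relaxation: softmax-weighted sum of all candidate operations."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def discretize(alphas: List[float]) -> str:
    """After search, keep the operation with the largest architecture parameter."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]
```

With equal architecture parameters, `mixed_op(1.0, [0.0, 0.0, 0.0])` averages the three operations and yields approximately 1.0, while `discretize([0.1, 2.0, -1.0])` selects `"double"`. In an actual search, the architecture parameters would be updated by gradient descent alongside the network weights rather than set by hand.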