SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning

📅 2022-07-08
🏛️ European Conference on Computer Vision
📈 Citations: 13
Influential: 1
🤖 AI Summary
To address the inefficiency of the conventional first-search-then-prune practice for obtaining efficient subnetworks and "lottery ticket" subnetworks—namely its costly search-train-prune-retrain pipeline—this paper shows for the first time that both efficient DNNs and their lottery tickets can be identified directly from a supernet, termed SuperTickets, via a two-in-one training scheme that jointly performs neural architecture search (NAS) and parameter pruning. A progressive and unified identification strategy allows the connectivity of subnetworks to change during supernet training, achieving better accuracy-efficiency trade-offs than conventional sparse training. Extensive experiments on three vision tasks and four benchmark datasets show that SuperTickets consistently outperforms typical NAS and pruning pipelines, with or without retraining. Moreover, the identified SuperTickets transfer well across tasks, indicating their potential for handling multiple tasks simultaneously.
📝 Abstract
Neural architecture search (NAS) has demonstrated amazing success in searching for efficient deep neural networks (DNNs) from a given supernet. In parallel, the lottery ticket hypothesis has shown that DNNs contain small subnetworks that can be trained from scratch to achieve a comparable or higher accuracy than original DNNs. As such, it is currently a common practice to develop efficient DNNs via a pipeline of first search and then prune. Nevertheless, doing so often requires a search-train-prune-retrain process and thus prohibitive computational cost. In this paper, we discover for the first time that both efficient DNNs and their lottery subnetworks (i.e., lottery tickets) can be directly identified from a supernet, which we term as SuperTickets, via a two-in-one training scheme with jointly architecture searching and parameter pruning. Moreover, we develop a progressive and unified SuperTickets identification strategy that allows the connectivity of subnetworks to change during supernet training, achieving better accuracy and efficiency trade-offs than conventional sparse training. Finally, we evaluate whether such identified SuperTickets drawn from one task can transfer well to other tasks, validating their potential of handling multiple tasks simultaneously. Extensive experiments and ablation studies on three tasks and four benchmark datasets validate that our proposed SuperTickets achieve boosted accuracy and efficiency trade-offs than both typical NAS and pruning pipelines, regardless of having retraining or not. Codes and pretrained models are available at https://github.com/RICE-EIC/SuperTickets.
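The two-in-one scheme described above can be illustrated with a toy sketch: a supernet whose candidate ops are mixed by architecture parameters, while per-op weight masks are re-drawn under a progressive sparsity schedule so that pruned connections can later revive. This is an illustrative assumption-laden sketch, not the paper's implementation; the class names, the cubic sparsity schedule, and the magnitude-based masking are all stand-ins for the method's actual components.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over architecture parameters."""
    e = np.exp(x - x.max())
    return e / e.sum()

def progressive_sparsity(step, total_steps, final_sparsity=0.8):
    # Cubic ramp from 0 to final_sparsity (a common schedule; the
    # paper's exact schedule may differ).
    t = min(step / total_steps, 1.0)
    return final_sparsity * (1.0 - (1.0 - t) ** 3)

def prune_mask(weights, sparsity):
    # Zero out the smallest-magnitude fraction of weights.
    k = int(round(sparsity * weights.size))
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    thresh = np.sort(np.abs(weights).ravel())[k - 1]
    return np.abs(weights) > thresh

class TwoInOneSupernet:
    """Toy supernet: candidate linear ops mixed by architecture
    weights `alpha`, with per-op pruning masks updated jointly
    during training (hypothetical names, for illustration only)."""

    def __init__(self, n_ops=3, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.alpha = np.zeros(n_ops)                 # architecture params
        self.W = rng.normal(size=(n_ops, dim, dim))  # candidate-op weights
        self.masks = np.ones_like(self.W, dtype=bool)

    def forward(self, x):
        mix = softmax(self.alpha)
        return sum(a * ((W * m) @ x)
                   for a, W, m in zip(mix, self.W, self.masks))

    def train_step(self, step, total_steps):
        # (1) gradient updates to alpha and W would go here;
        # (2) masks are re-drawn each step, so connectivity can change:
        #     a previously pruned weight may revive if its magnitude grows.
        s = progressive_sparsity(step, total_steps)
        self.masks = np.stack([prune_mask(W, s) for W in self.W])
```

Re-drawing masks every step, rather than fixing them once, is what lets subnetwork connectivity evolve during supernet training, which the abstract credits for the improved accuracy-efficiency trade-off.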
Problem

Research questions and friction points this paper is trying to address.

Can efficient DNNs and their lottery-ticket subnetworks be identified directly from a supernet, avoiding the costly search-train-prune-retrain pipeline?
How can architecture search and parameter pruning be unified into a single training scheme?
Do the identified SuperTickets transfer across tasks without task-specific retraining?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Joint architecture search and parameter pruning via a two-in-one training scheme
Progressive unified SuperTickets identification strategy
Task-agnostic SuperTickets transfer across multiple tasks