AI Summary
The high computational and memory overhead of pretrained vision models hinders their practical deployment, while conventional pruning methods rely on downstream task data, making them unsuitable for task-agnostic scenarios. Method: We systematically investigate structured pruning at initialization, prior to any task-specific fine-tuning, demonstrating that pruning can be performed solely on pretrained weights without access to any downstream data while preserving zero-shot generalization across unseen tasks. Subsequent lightweight fine-tuning fully restores accuracy on both the original and retained tasks. Contribution/Results: We reveal that the smooth loss landscape induced by large-scale pretraining underpins cross-task knowledge transfer, and we analyze pruning stability from a second-order optimization perspective. Experiments across diverse unseen tasks show that our approach achieves efficient model compression while maintaining strong zero-shot generalization, establishing a novel, task-agnostic paradigm for lightweight vision models.
Abstract
The widespread availability of pre-trained vision models has enabled numerous deep learning applications through their transferable representations. However, their computational and storage costs often limit practical deployment. Pruning-at-Initialization has emerged as a promising approach to compress models before training, enabling efficient task-specific adaptation. While conventional wisdom suggests that effective pruning requires task-specific data, this creates a challenge when downstream tasks are unknown in advance. In this paper, we investigate how data influences the pruning of pre-trained vision models. Surprisingly, pruning on one task also retains the model's zero-shot performance on unseen tasks. Furthermore, fine-tuning these pruned models not only improves performance on the original seen tasks but can also recover performance on held-out tasks. We attribute this phenomenon to the favorable loss landscapes induced by extensive pre-training on large-scale datasets.
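To make the core idea concrete, here is a minimal, hypothetical sketch of data-free structured pruning: output channels of a pretrained weight matrix are scored by their L2 norms and the lowest-scoring rows are zeroed out entirely, so no downstream task data is involved. The function name `structured_prune` and the use of a random matrix as a stand-in for pretrained weights are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def structured_prune(weight: np.ndarray, sparsity: float) -> np.ndarray:
    """Data-free structured pruning sketch (illustrative, not the paper's method):
    zero the output rows with the smallest L2 norms, using only the pretrained
    weights themselves rather than any downstream task data."""
    scores = np.linalg.norm(weight, axis=1)   # importance score per output channel
    k = int(sparsity * weight.shape[0])       # number of channels to remove
    pruned = weight.copy()
    if k > 0:
        drop = np.argsort(scores)[:k]         # indices of the lowest-score channels
        pruned[drop, :] = 0.0                 # zero whole rows (structured, not unstructured)
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8))                  # stand-in for one pretrained layer's weights
pruned = structured_prune(w, sparsity=0.5)
zero_rows = int((np.abs(pruned).sum(axis=1) == 0).sum())
print(zero_rows)                              # -> 8 of 16 output channels fully zeroed
```

Because whole rows (channels) are removed rather than scattered individual weights, the pruned layer can in principle be re-materialized as a smaller dense layer, which is what makes structured pruning attractive for actual compute and memory savings.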