TAP-ViTs: Task-Adaptive Pruning for On-Device Deployment of Vision Transformers

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of deploying Vision Transformers (ViTs) on resource-constrained mobile and edge devices, where existing pruning methods struggle to balance device heterogeneity, task customization, and data privacy. The authors propose a task-adaptive pruning framework that generates customized lightweight ViT models for diverse devices without accessing local raw data. Central to this approach is a novel privacy-preserving task feature extraction mechanism based on Gaussian Mixture Models (GMMs), which leverages public data to construct task proxy sets. Fine-grained pruning is achieved through a dual-granularity joint assessment of both neuron- and layer-level importance. Extensive experiments demonstrate that the proposed method consistently outperforms state-of-the-art pruning techniques across various ViT architectures and datasets, achieving higher accuracy at equivalent compression ratios.
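The GMM-based task feature extraction described above can be sketched roughly as follows. This is a hedged illustration, not the paper's exact procedure: the feature dimensions, component count, and top-k selection rule are assumptions, and the paper operates on ViT task features rather than the synthetic vectors used here.

```python
# Hypothetical sketch of GMM-based metric-dataset construction: the device
# uploads only GMM parameters; the cloud ranks public samples by likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# --- Device side: fit a lightweight GMM on private feature embeddings ---
private_feats = rng.normal(loc=2.0, scale=0.5, size=(500, 8))
device_gmm = GaussianMixture(n_components=3, covariance_type="diag",
                             random_state=0).fit(private_feats)
# Only these parameters leave the device, never the raw samples.
uploaded = {"weights": device_gmm.weights_,
            "means": device_gmm.means_,
            "covariances": device_gmm.covariances_}

# --- Cloud side: rebuild the GMM and rank public samples by likelihood ---
cloud_gmm = GaussianMixture(n_components=3, covariance_type="diag")
cloud_gmm.weights_ = uploaded["weights"]
cloud_gmm.means_ = uploaded["means"]
cloud_gmm.covariances_ = uploaded["covariances"]
# For diagonal covariances, the Cholesky precision is 1/sqrt(variance).
cloud_gmm.precisions_cholesky_ = 1.0 / np.sqrt(uploaded["covariances"])

public_feats = rng.normal(loc=0.0, scale=2.0, size=(2000, 8))
scores = cloud_gmm.score_samples(public_feats)        # log-likelihood per sample
metric_set = public_feats[np.argsort(scores)[-256:]]  # distribution-consistent proxy
print(metric_set.shape)  # (256, 8)
```

The selected public samples skew toward the device's private distribution, giving the cloud a task-representative proxy set without ever seeing raw local data.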

📝 Abstract
Vision Transformers (ViTs) have demonstrated strong performance across a wide range of vision tasks, yet their substantial computational and memory demands hinder efficient deployment on resource-constrained mobile and edge devices. Pruning has emerged as a promising direction for reducing ViT complexity. However, existing approaches either (i) produce a single pruned model shared across all devices, ignoring device heterogeneity, or (ii) rely on fine-tuning with device-local data, which is often infeasible due to limited on-device resources and strict privacy constraints. As a result, current methods fall short of enabling task-customized ViT pruning in privacy-preserving mobile computing settings. This paper introduces TAP-ViTs, a novel task-adaptive pruning framework that generates device-specific pruned ViT models without requiring access to any raw local data. Specifically, to infer device-level task characteristics under privacy constraints, we propose a Gaussian Mixture Model (GMM)-based metric dataset construction mechanism. Each device fits a lightweight GMM to approximate its private data distribution and uploads only the GMM parameters. Using these parameters, the cloud selects distribution-consistent samples from public data to construct a task-representative metric dataset for each device. Based on this proxy dataset, we further develop a dual-granularity importance evaluation-based pruning strategy that jointly measures composite neuron importance and adaptive layer importance, enabling fine-grained, task-aware pruning tailored to each device's computational budget. Extensive experiments across multiple ViT backbones and datasets demonstrate that TAP-ViTs consistently outperforms state-of-the-art pruning methods under comparable compression ratios.
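The dual-granularity pruning strategy from the abstract, where neuron-level and layer-level importance are assessed jointly under a compression budget, might look like the following sketch. The scoring functions here (mean absolute activation for neurons, normalized activation variance for layers) are generic stand-ins for TAP-ViTs' composite neuron importance and adaptive layer importance, which the abstract does not fully specify.

```python
# Hedged sketch of dual-granularity pruning: neuron-level scores are
# weighted by a layer-level importance before one global budget cut.
import numpy as np

rng = np.random.default_rng(1)

# Activations of each layer collected on the device's proxy metric set:
# layer name -> (num_proxy_samples, num_neurons). Scales differ per layer.
activations = {f"layer{i}": rng.normal(size=(128, 64)) * (i + 1)
               for i in range(4)}

# Neuron importance: mean absolute activation over the metric set.
neuron_imp = {name: np.abs(a).mean(axis=0) for name, a in activations.items()}

# Adaptive layer importance: normalized activation variance per layer.
layer_var = np.array([a.var() for a in activations.values()])
layer_imp = layer_var / layer_var.sum()

# Joint score: each neuron's score scaled by its layer's importance, then
# a single global threshold enforces the device's compression budget.
joint = np.concatenate([imp * w for imp, w
                        in zip(neuron_imp.values(), layer_imp)])
keep_ratio = 0.5
threshold = np.quantile(joint, 1 - keep_ratio)
masks = {name: neuron_imp[name] * w > threshold
         for (name, _), w in zip(activations.items(), layer_imp)}
kept = sum(int(m.sum()) for m in masks.values())
print(f"kept {kept} of {joint.size} neurons")
```

Because the layer weight multiplies every neuron score within a layer, less important layers are pruned more aggressively under the same global threshold, which is the intuition behind joint neuron- and layer-level assessment.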
Problem

Research questions and friction points this paper is trying to address.

Vision Transformers
model pruning
on-device deployment
privacy-preserving
task-adaptive
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-Adaptive Pruning
Vision Transformers
Privacy-Preserving
Gaussian Mixture Model
On-Device Deployment
Zhibo Wang
Professor at College of Computer Science and Technology, Zhejiang University
Internet of Things · AI Security · Data Security and Privacy
Zuoyuan Zhang
State Key Laboratory of Blockchain and Data Security and the College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Xiaoyi Pang
State Key Laboratory of Blockchain and Data Security and the College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Qile Zhang
State Key Laboratory of Blockchain and Data Security and the College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Xuanyi Hao
State Key Laboratory of Blockchain and Data Security and the College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Shuguo Zhuo
State Key Laboratory of Blockchain and Data Security and the College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Peng Sun
College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China