TinySAM: Pushing the Envelope for Efficient Segment Anything Model

📅 2023-12-21
🏛️ arXiv.org
📈 Citations: 8
Influential: 1
📄 PDF
🤖 AI Summary
To address the tension between resource-constrained edge devices and the high computational overhead of the Segment Anything Model (SAM), this paper proposes TinySAM, a lightweight, general-purpose segmentation model that preserves SAM's zero-shot segmentation capability while enabling efficient on-device deployment. Methodologically, the authors introduce three techniques: (1) full-stage knowledge distillation incorporating hard prompt sampling and hard mask weighting; (2) post-training quantization tailored to prompt-based segmentation; and (3) a hierarchical "segment everything" acceleration strategy. Experiments show that TinySAM achieves a 2× speedup in everything-mode inference and reduces computational cost by orders of magnitude while maintaining near-lossless accuracy. Crucially, its zero-shot transfer performance significantly surpasses existing lightweight alternatives.
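The hard mask weighting idea in the distillation pipeline can be illustrated with a minimal sketch: pixels where the student's predicted mask disagrees most with the teacher's are up-weighted in the distillation loss, so training focuses on hard regions. The function name, the weighting formula, and the `gamma` parameter below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hard_mask_weighted_loss(student_logits, teacher_logits, gamma=2.0):
    """Hypothetical sketch of hard mask weighting for mask-level
    distillation: per-pixel losses are re-weighted by how strongly
    the student disagrees with the teacher (details may differ
    from the paper's actual loss)."""
    s = 1.0 / (1.0 + np.exp(-student_logits))  # student mask probabilities
    t = 1.0 / (1.0 + np.exp(-teacher_logits))  # teacher mask probabilities
    err = np.abs(s - t)                        # per-pixel disagreement
    w = (1.0 + err) ** gamma                   # up-weight hard pixels
    return float(np.mean(w * (s - t) ** 2))    # weighted MSE over the mask
```

With identical logits the loss is zero; as student and teacher diverge, hard pixels dominate the average more than under a plain MSE.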
📝 Abstract
Recently, the segment anything model (SAM) has shown powerful segmentation capability and has drawn great attention in computer vision fields. Many follow-up works have developed various applications based on the pre-trained SAM and achieved impressive performance on downstream vision tasks. However, SAM consists of heavy architectures and requires massive computational capacity, which hinders its further application on computation-constrained edge devices. To this end, in this paper we propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining the strong zero-shot performance. We first propose a full-stage knowledge distillation method with hard prompt sampling and a hard mask weighting strategy to distill a lightweight student model. We also adapt post-training quantization to the prompt-based segmentation task and further reduce the computational cost. Moreover, a hierarchical segmenting-everything strategy is proposed to accelerate everything-mode inference by $2\times$ with almost no performance degradation. With all these proposed methods, our TinySAM leads to orders of magnitude computational reduction and pushes the envelope for the efficient segment anything task. Extensive experiments on various zero-shot transfer tasks demonstrate the significantly advantageous performance of our TinySAM against counterpart methods. Codes are available at https://github.com/xinghaochen/TinySAM and https://gitee.com/mindspore/models/tree/master/research/cv/TinySAM.
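The hierarchical segmenting-everything strategy can be sketched as a two-pass point-prompt scheme: a coarse pass produces a per-pixel confidence map, and fine-grid points that fall inside already-confident regions are skipped, cutting the number of prompts the decoder must process. The function name, grid sizes, and threshold below are illustrative assumptions; the paper's actual procedure may differ.

```python
import numpy as np

def hierarchical_point_grid(mask_conf, fine=32, thresh=0.8):
    """Hypothetical sketch of hierarchical everything-mode inference:
    given the confidence map from a coarse first pass (values in
    [0, 1]), keep only fine-grid prompt points in regions that are
    not yet confidently segmented."""
    h, w = mask_conf.shape
    # Fine-grid candidate points in normalized image coordinates.
    ys = (np.arange(fine) + 0.5) / fine
    xs = (np.arange(fine) + 0.5) / fine
    pts = [(y, x) for y in ys for x in xs]
    # Skip points already covered by high-confidence coarse masks.
    return [(y, x) for (y, x) in pts
            if mask_conf[int(y * h), int(x * w)] < thresh]
```

If the coarse pass already covers most of the image with confident masks, the fine pass prompts only the remaining regions, which is where the reported speedup with near-lossless quality would come from.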
Problem

Research questions and friction points this paper is trying to address.

Image Segmentation
Computational Efficiency
SAM Model Adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

TinySAM
Hierarchical Segmentation Strategy
Low-Computational Demand