🤖 AI Summary
To address the large model size and slow inference of the Segment Anything Model (SAM) and its variants in practical deployment, this paper proposes Birkhoff—a training-data-free, general-purpose, lossless compression framework. Methodologically, Birkhoff introduces (1) Hyper-Compression, which maps high-dimensional parameter vectors to low-dimensional scalars via dense trajectories, and (2) the HyperLinear operator, which tightly integrates decompression with matrix multiplication to eliminate redundant computation. Evaluated on 18 SAM variants, Birkhoff achieves a 5.17× compression ratio on SAM2-B with less than 1% performance degradation; per-model compression takes under 60 seconds, and inference remains fast. Birkhoff delivers deployment agility, model compactness, and cross-architecture generalizability—establishing a novel paradigm for lightweighting vision foundation models.
📝 Abstract
Owing to their excellent performance in high-quality, zero-shot segmentation, the Segment Anything Model (SAM) and its variants have been widely applied in diverse scenarios such as healthcare and intelligent manufacturing. However, their large model size and slow inference hinder practical deployment, so effectively compressing SAMs has become an increasingly pressing need. In this study, we propose Birkhoff, a novel data-free compression algorithm for SAM and its variants. Unlike quantization, pruning, distillation, and other compression methods, Birkhoff offers versatility across model types, agility in deployment, faithfulness to the original model, and compactness in model size. Specifically, Birkhoff introduces a novel compression algorithm, Hyper-Compression, whose core principle is to find a dense trajectory that turns a high-dimensional parameter vector into a single scalar. Furthermore, Birkhoff designs a dedicated linear-layer operator, HyperLinear, which fuses decompression with matrix multiplication to significantly accelerate inference of the compressed SAMs. Extensive experiments on 18 SAM variants across the COCO, LVIS, and SA-1B datasets show that Birkhoff performs consistently and competitively in compression time, compression ratio, post-compression performance, and inference speed. For example, Birkhoff achieves a compression ratio of 5.17x on SAM2-B with less than 1% performance drop, without using any fine-tuning data. Moreover, compression finishes within 60 seconds for every model.
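The "dense trajectory" principle can be illustrated with a minimal sketch. A line t ↦ (t·a₁ mod 1, …, t·a_d mod 1) with rationally independent slopes a_i is dense in the unit hypercube (Weyl equidistribution), so a single scalar t can approximate any d-dimensional parameter vector to arbitrary precision; storing t in place of the vector yields compression. The slopes, vector dimension, and search range below are illustrative choices for this toy demonstration, not the paper's actual Hyper-Compression parameters.

```python
import numpy as np

# Toy sketch of the dense-trajectory idea: approximate a d-dimensional
# parameter vector w in [0,1)^d by a single integer scalar t, using the
# trajectory t -> (t*a mod 1) with rationally independent slopes a.
d = 4
rng = np.random.default_rng(0)
w = rng.random(d)  # "weights" to compress

# Square roots of distinct primes are rationally independent (illustrative choice)
a = np.sqrt(np.array([2.0, 3.0, 5.0, 7.0]))

# Compression: search integer scalars t for the closest trajectory point to w
candidates = np.arange(1, 200_000)
traj = np.mod(np.outer(candidates, a), 1.0)        # shape (T, d)
errs = np.linalg.norm(traj - w, axis=1)
t_star = int(candidates[np.argmin(errs)])           # the stored scalar

# Decompression: recompute the trajectory point from t_star alone
w_hat = np.mod(t_star * a, 1.0)
print("scalar:", t_star, "max abs error:", np.max(np.abs(w_hat - w)))
```

A larger search range (or a finer-grained scalar) drives the error down, at the cost of more bits for t; the paper's HyperLinear operator avoids materializing the decompressed weights by fusing this reconstruction step into the matrix multiplication itself.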