🤖 AI Summary
To address the challenge of efficiently deploying referring image segmentation (RIS) models on resource-constrained edge devices, this paper proposes the first lightweight post-training quantization (PTQ) framework specifically designed for RIS. To tackle the heterogeneous quantization sensitivity of the multimodal encoders and the severe performance degradation under low-bit (4–8 bit) quantization, we introduce Dual-Region Quantization (DRQ), which adaptively assigns distinct quantization strategies to the vision and language branches, and Reordering-based Outlier-Retaining Quantization (RORQ), which preserves critical channel-wise information. As the first PTQ method tailored for RIS, our approach achieves state-of-the-art performance at 4-bit precision on three mainstream benchmarks, significantly outperforming generic PTQ baselines, and demonstrates the feasibility of deploying RIS models on edge devices with both high accuracy and low computational overhead.
📝 Abstract
Referring Image Segmentation (RIS) aims to segment the object referred to by a given sentence in an image by understanding both visual and linguistic information. However, existing RIS methods tend to pursue top-performing models while disregarding practical deployment on resource-limited edge devices. This oversight poses a significant challenge for on-device RIS inference. To this end, we propose an effective and efficient post-training quantization framework termed PTQ4RIS. Specifically, we first conduct an in-depth analysis of the root causes of performance degradation in RIS model quantization, and then propose dual-region quantization (DRQ) and reorder-based outlier-retained quantization (RORQ) to address the quantization difficulties in the visual and text encoders. Extensive experiments on three benchmarks with different bit-width settings (from 8 to 4 bits) demonstrate its superior performance. Importantly, ours is the first PTQ method specifically designed for the RIS task, highlighting the feasibility of PTQ in RIS applications. Code and video are available at https://github.com/gugu511yy/PTQ4RIS.
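To make the outlier-retaining idea concrete, here is a minimal sketch of a reorder-then-quantize scheme: channels are ranked by magnitude, the largest (outlier) channels are kept in full precision, and the remaining channels are uniformly quantized to low bit-width. This is an illustrative simplification, not the paper's actual RORQ implementation; the function names, the `outlier_frac` parameter, and the per-channel grouping are assumptions for demonstration.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniform symmetric fake-quantization of array x to the given bit width."""
    qmax = 2 ** (bits - 1) - 1
    amax = np.abs(x).max()
    scale = amax / qmax if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale  # dequantized values, for error analysis

def outlier_retaining_quantize(w, bits=4, outlier_frac=0.05):
    """Illustrative sketch: reorder channels by magnitude, retain the largest
    ones in full precision, and quantize the rest channel-by-channel."""
    # w: (out_channels, in_channels) weight matrix
    mags = np.abs(w).max(axis=1)            # per-channel max magnitude
    order = np.argsort(mags)[::-1]          # reorder: largest channels first
    n_outliers = max(1, int(len(order) * outlier_frac))
    out = np.empty_like(w)
    out[order[:n_outliers]] = w[order[:n_outliers]]   # retain outlier channels
    for c in order[n_outliers:]:                       # quantize the rest
        out[c] = quantize_uniform(w[c], bits)
    return out
```

Retaining a small fraction of outlier channels keeps the quantization scale of the remaining channels small, which is the intuition behind preserving channel-wise information under 4-bit quantization.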