PTQ4RIS: Post-Training Quantization for Referring Image Segmentation

📅 2024-09-25
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To make referring image segmentation (RIS) models efficient to deploy on resource-constrained edge devices, this paper proposes PTQ4RIS, the first post-training quantization (PTQ) framework designed specifically for RIS. After analyzing the root causes of performance degradation when RIS models are quantized, it introduces dual-region quantization (DRQ) and reorder-based outlier-retained quantization (RORQ) to address the quantization difficulties of the visual and text encoders. Experiments on three benchmarks at bit widths from 8 down to 4 show superior performance, demonstrating that PTQ is feasible for RIS applications.

📝 Abstract
Referring Image Segmentation (RIS) aims to segment the object referred to by a given sentence in an image by understanding both visual and linguistic information. However, existing RIS methods tend to pursue top-performing models and disregard practical deployment on resource-limited edge devices. This oversight poses a significant challenge for on-device RIS inference. To this end, we propose an effective and efficient post-training quantization framework termed PTQ4RIS. Specifically, we first conduct an in-depth analysis of the root causes of performance degradation in RIS model quantization and propose dual-region quantization (DRQ) and reorder-based outlier-retained quantization (RORQ) to address the quantization difficulties in visual and text encoders. Extensive experiments on three benchmarks with different bit settings (from 8 to 4 bits) demonstrate its superior performance. Importantly, PTQ4RIS is the first PTQ method specifically designed for the RIS task, highlighting the feasibility of PTQ in RIS applications. Code and video are available at https://github.com/gugu511yy/PTQ4RIS.
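
To make the bit settings in the abstract concrete, the following is a minimal sketch of generic uniform min-max quantization, not the PTQ4RIS method itself. The function names and the synthetic Gaussian "activations" are assumptions made for illustration; the sketch only shows how reconstruction error grows as the bit width drops from 8 to 4.

```python
import random


def quantize_dequantize(values, num_bits):
    """Uniform asymmetric (min-max) fake quantization of a list of floats.

    Illustrative helper (hypothetical name); it is not taken from PTQ4RIS.
    """
    qmax = 2 ** num_bits - 1                 # 255 for 8 bits, 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax or 1.0          # fall back to 1.0 for constant input
    out = []
    for v in values:
        q = round((v - lo) / scale)          # integer code
        q = max(0, min(qmax, q))             # clamp the code to [0, qmax]
        out.append(q * scale + lo)           # map the code back to a float
    return out


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Synthetic stand-in for a layer's activations (an assumption, not real RIS data).
activations = [random.gauss(0.0, 1.0) for _ in range(1000)]

mse_by_bits = {
    bits: mean_squared_error(activations, quantize_dequantize(activations, bits))
    for bits in (8, 6, 4)
}
for bits, mse in mse_by_bits.items():
    print(f"{bits}-bit reconstruction MSE: {mse:.6f}")
```

Each bit removed roughly doubles the quantization step, so the error rises quickly toward 4 bits. This is the regime where a task-specific treatment of the encoders, as the abstract describes, becomes relevant.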
Problem

Research questions and friction points this paper is trying to address.

Addressing quantization in RIS models
Optimizing for edge device constraints
Enhancing on-device inference efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Post-training quantization framework
Dual-region quantization technique
Reorder-based outlier-retained quantization
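
As a rough intuition for why outliers matter in low-bit quantization, here is a generic sketch that is not the paper's RORQ algorithm. It assumes a toy vector with one large value and compares plain min-max quantization against keeping the largest-magnitude entries in full precision; all names and data are hypothetical.

```python
def fake_quantize(values, num_bits):
    """Uniform min-max fake quantization (generic helper, not from the paper)."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax or 1.0          # fall back to 1.0 for constant input
    return [round((v - lo) / scale) * scale + lo for v in values]


def quantize_retaining_outliers(values, num_bits, num_outliers):
    """Keep the largest-magnitude entries in full precision, quantize the rest.

    Assumes num_outliers < len(values). This is a simplified illustration of
    outlier retention in general, not the reordering scheme used by RORQ.
    """
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    outlier_idx = set(order[:num_outliers])
    normal_idx = [i for i in range(len(values)) if i not in outlier_idx]
    quantized = fake_quantize([values[i] for i in normal_idx], num_bits)
    result = list(values)                    # outliers stay untouched
    for i, q in zip(normal_idx, quantized):
        result[i] = q
    return result


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy data (an assumption): 21 values in [-1, 1] plus a single outlier at 40.
activations = [0.1 * i for i in range(-10, 11)] + [40.0]

plain = fake_quantize(activations, 4)
retained = quantize_retaining_outliers(activations, 4, num_outliers=1)

plain_mse = mean_squared_error(activations, plain)
retained_mse = mean_squared_error(activations, retained)
print(f"plain 4-bit MSE:            {plain_mse:.6f}")
print(f"outlier-retained 4-bit MSE: {retained_mse:.6f}")
```

In the plain case the single outlier stretches the quantization range, so the 21 small values collapse onto only two quantization levels. Setting the outlier aside lets the remaining values use the full 4-bit grid.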
Xiaoyan Jiang
School of Electronic and Electrical Engineering, Shanghai University of Engineering Science
Hang Yang
School of Electronic and Electrical Engineering, Shanghai University of Engineering Science
Kaiying Zhu
SenseTime
Xihe Qiu
Associate Professor, Shanghai University of Engineering Science
AI for Healthcare, Vision-Language Models, Reinforcement Learning, Large Language Models
Shibo Zhao
Carnegie Mellon University (superodometry.com)
3D Reconstruction, SLAM
Sifan Zhou
Southeast University
Robotics, M/LLMs, Spatial AI, Quantization