Can Less Precise Be More Reliable? A Systematic Evaluation of Quantization's Impact on CLIP Beyond Accuracy

📅 2025-09-25
📈 Citations: 0
✹ Influential: 0
đŸ€– AI Summary
Prior work on CLIP quantization focuses predominantly on accuracy degradation, overlooking its impact on reliability aspects—such as calibration quality and out-of-distribution (OOD) detection. Method: We conduct a systematic evaluation of diverse quantization strategies—including quantization-aware training (QAT)—on CLIP’s reliability. We identify counterintuitive phenomena: quantization can improve calibration for underconfident models and even enhance OOD detection performance despite calibration deterioration. Building on these insights, we propose a tailored QAT framework explicitly optimizing for reliability. Contribution/Results: Extensive experiments demonstrate that quantization need not compromise reliability; under specific conditions, it simultaneously improves zero-shot classification accuracy, temperature-scaled expected calibration error (ECE), and OOD detection AUC. Our results challenge the conventional efficiency–performance trade-off assumption, establishing that quantization—when properly designed—can jointly enhance accuracy, calibration, and robustness in zero-shot vision–language learning.
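
As a concrete reference for the calibration metric named above, here is a minimal PyTorch sketch of temperature scaling followed by expected calibration error (ECE). The 15-bin scheme, the LBFGS fit, and all variable names are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch: temperature scaling + expected calibration error (ECE).
# Bin count, optimizer, and names are illustrative, not the paper's setup.
import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, max_iter=50):
    """Fit a scalar temperature T on held-out logits by minimizing NLL."""
    log_t = torch.zeros(1, requires_grad=True)
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=max_iter)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

def expected_calibration_error(logits, labels, n_bins=15):
    """Equal-width-bin ECE over max-softmax confidences."""
    probs = logits.softmax(dim=-1)
    conf, pred = probs.max(dim=-1)
    correct = pred.eq(labels).float()
    ece = torch.zeros(1)
    edges = torch.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its mass.
            ece += mask.float().mean() * (correct[mask].mean() - conf[mask].mean()).abs()
    return ece.item()

# Usage: T = fit_temperature(val_logits, val_labels)
#        ece = expected_calibration_error(test_logits / T, test_labels)
```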

📝 Abstract
The powerful zero-shot generalization capabilities of vision-language models (VLMs) like CLIP have enabled new paradigms for safety-related tasks such as out-of-distribution (OOD) detection. However, additional aspects crucial for the computationally efficient and reliable deployment of CLIP are still overlooked. In particular, the impact of quantization on CLIP's performance beyond accuracy remains underexplored. This work presents a large-scale evaluation of quantization on CLIP models, assessing not only in-distribution accuracy but a comprehensive suite of reliability metrics and revealing counterintuitive results driven by pre-training source. We demonstrate that quantization consistently improves calibration for typically underconfident pre-trained models, while often degrading it for overconfident variants. Intriguingly, this degradation in calibration does not preclude gains in other reliability metrics; we find that OOD detection can still improve for these same poorly calibrated models. Furthermore, we identify specific quantization-aware training (QAT) methods that yield simultaneous gains in zero-shot accuracy, calibration, and OOD robustness, challenging the view of a strict efficiency-performance trade-off. These findings offer critical insights for navigating the multi-objective problem of deploying efficient, reliable, and robust VLMs by utilizing quantization beyond its conventional role.
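
To make the zero-shot setting concrete, the sketch below shows CLIP zero-shot classification and the common maximum-softmax-probability (MSP) OOD score, assuming the open_clip package; the checkpoint, label set, and prompt template are placeholder choices, and the paper's own OOD scoring rule may differ.

```python
# Sketch of CLIP zero-shot scoring, used both for classification and for a
# max-softmax-probability (MSP) OOD score. Assumes the open_clip package;
# checkpoint name, labels, and prompt template are illustrative.
import torch
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

class_names = ["dog", "cat", "car"]  # placeholder label set
text = tokenizer([f"a photo of a {c}" for c in class_names])

@torch.no_grad()
def zero_shot_logits(images):
    img_feat = torch.nn.functional.normalize(model.encode_image(images), dim=-1)
    txt_feat = torch.nn.functional.normalize(model.encode_text(text), dim=-1)
    return model.logit_scale.exp() * img_feat @ txt_feat.t()

@torch.no_grad()
def msp_ood_score(images):
    # Higher max softmax probability => more likely in-distribution.
    return zero_shot_logits(images).softmax(dim=-1).max(dim=-1).values
```

The OOD detection AUC can then be computed from the in-distribution versus OOD score distributions, e.g. with sklearn.metrics.roc_auc_score.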
Problem

Research questions and friction points this paper is trying to address.

Evaluating how quantization affects CLIP model reliability beyond accuracy
Assessing quantization's impact on calibration and out-of-distribution detection
Exploring quantization methods for efficient yet reliable vision-language models (a baseline quantization recipe is sketched after this list)
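
As a point of reference for these questions, here is one generic post-training quantization baseline in PyTorch: dynamic INT8 quantization of the linear layers, after which the same reliability metrics can be re-run. This is a standard recipe, not necessarily one of the strategies the paper evaluates.

```python
# Generic post-training quantization baseline: dynamic INT8 quantization of
# linear layers (CPU inference). Not necessarily one of the paper's methods.
import torch

def quantize_linear_layers(model):
    return torch.ao.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8)

# Usage: compare accuracy / ECE / OOD AUROC of `model` vs.
# `quantize_linear_layers(model)` on the same evaluation splits.
```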
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic evaluation of quantization's impact on CLIP reliability metrics
Quantization improves calibration for underconfident models
Quantization-aware training enhances accuracy and robustness (a minimal fake-quantization sketch follows this list)
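
For the QAT claim, below is a minimal fake-quantization building block with a straight-through estimator, the core mechanism most QAT methods share: weights are rounded to an INT8 grid in the forward pass while gradients flow through unchanged. Purely illustrative; the paper's reliability-oriented QAT objective is not reproduced here.

```python
# Minimal QAT building block: fake INT8 weight quantization with a
# straight-through estimator. Illustrative only.
import torch

class FakeQuant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, n_bits=8):
        qmax = 2 ** (n_bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        # Round to the integer grid, then map back to float.
        return (w / scale).round().clamp(-qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None  # straight-through: pass gradient unchanged

class QATLinear(torch.nn.Linear):
    def forward(self, x):
        return torch.nn.functional.linear(
            x, FakeQuant.apply(self.weight), self.bias)
```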
Aymen Bouguerra
Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France
Daniel Montoya
Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France
Alexandra Gomez-Villa
Assistant Professor, Universitat AutĂČnoma de Barcelona & Researcher, Computer Vision Center
Computer vision · Machine learning · Visual perception
Fabio Arnez
Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France
Chokri Mraidha
Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France