Zero-Shot Quantization via Weight-Space Arithmetic

πŸ“… 2026-04-03
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the limited robustness of models to post-training quantization (PTQ) when no target-task data are available, proposing a zero-shot transfer method. It introduces "quantization vectors", extracted via simple arithmetic operations in weight space, and transfers them from a source model to a target Vision Transformer (ViT). The approach demonstrates for the first time that quantization robustness corresponds to a transferable direction in weight space, yielding significant gains in resilience to low-bit quantization noise without quantization-aware training, fine-tuning, or any data from the target task. Experiments show that the method improves PTQ robustness on ViTs by up to 60%, establishing a low-cost, zero-shot paradigm for quantization.
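The "low-bit quantization noise" the summary refers to is the perturbation introduced when PTQ rounds weights onto a coarse grid. A minimal round-to-nearest sketch of that effect, using generic symmetric uniform quantization (a standard baseline, not the paper's specific scheme; all names here are illustrative):

```python
import numpy as np

def quantize_uniform(w, bits=4):
    """Symmetric round-to-nearest uniform quantization of a weight tensor.

    Scale by the largest magnitude, round onto the integer grid for the
    given bit width, then map back to floats. The difference w_q - w is
    the PTQ-induced noise the model must be robust to.
    """
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels at 4 bits
    scale = np.abs(w).max() / levels
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)   # stand-in weight tensor
w_q = quantize_uniform(w, bits=4)
noise = w_q - w                                # per-weight rounding error
print(f"max |noise| = {np.abs(noise).max():.4f}")
```

At 4 bits the per-weight error is bounded by half a quantization step, but at this coarseness the accumulated effect across a network is what degrades accuracy without some form of robustness.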
πŸ“ Abstract
We show that robustness to post-training quantization (PTQ) is a transferable direction in weight space. We call this direction the quantization vector: extracted from a donor task by simple weight-space arithmetic, it can be used to patch a receiver model and improve robustness to PTQ-induced noise by as much as 60%, without receiver-side quantization-aware training (QAT). Because the method requires no receiver training data, it provides a zero-shot, low-cost alternative to QAT for extremely low-bit deployment. We demonstrate this on Vision Transformer (ViT) models. More broadly, our results suggest that quantization robustness is not merely a byproduct of task-specific training, but a reusable feature of weight-space geometry that can be transferred rather than retrained.
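The weight-space arithmetic the abstract describes can be sketched in the style of task vectors: subtract a base checkpoint from a quantization-robust donor checkpoint to get the vector, then add a scaled copy to the receiver. This is an illustrative reconstruction under assumptions (checkpoints modeled as dicts of arrays, the donor's robustness stood in by an arbitrary perturbation, function names invented here), not the paper's code:

```python
import numpy as np

def extract_quantization_vector(donor_robust, donor_base):
    """tau = theta_robust - theta_base, computed per weight tensor."""
    return {k: donor_robust[k] - donor_base[k] for k in donor_base}

def apply_quantization_vector(receiver, tau, alpha=1.0):
    """Patch the receiver in weight space: theta_receiver + alpha * tau."""
    return {k: receiver[k] + alpha * tau[k] for k in receiver}

rng = np.random.default_rng(0)
base = {"w": rng.normal(size=(4, 4))}                       # donor before robustification
robust = {"w": base["w"] + 0.1 * rng.normal(size=(4, 4))}   # stand-in robust donor
receiver = {"w": rng.normal(size=(4, 4))}                   # zero-shot target model

tau = extract_quantization_vector(robust, base)
patched = apply_quantization_vector(receiver, tau, alpha=0.5)
```

Note that no receiver-side data or gradients appear anywhere: the transfer is pure weight arithmetic, which is what makes the method zero-shot. The scaling coefficient (`alpha` above) is an assumed knob in the spirit of task-arithmetic methods; the paper itself would define how the direction is scaled.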
Problem

Research questions and friction points this paper is trying to address.

post-training quantization
quantization robustness
zero-shot
weight-space geometry
quantization-aware training
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-shot quantization
weight-space arithmetic
quantization vector
post-training quantization
quantization robustness
Daniele Solombrino
Sapienza University of Rome
Antonio Andrea Gargiulo
Sapienza University of Rome
Adrian Robert Minut
PhD Student, Sapienza University of Rome
Large Language Models, Model Merging, Uncertainty Estimation
Luca Zhou
Sapienza University of Rome
Alessandro Zirilli
Sapienza University of Rome
Emanuele RodolΓ 
Professor of Computer Science, Sapienza University of Rome
Machine Learning, Audio, Geometric Deep Learning, Geometry Processing, Computer Vision