Test-Time Model Adaptation for Quantized Neural Networks

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Quantized neural networks suffer significant performance degradation under domain shifts in dynamic environments. To address this, we propose Zeroth-Order Adaptation (ZOA), a test-time adaptation framework that eliminates backpropagation and updates the model solely via two forward passes. ZOA incorporates a lightweight domain-knowledge management mechanism that enables cross-domain knowledge accumulation and interference suppression with minimal storage overhead. The framework is architecture-agnostic—compatible with both CNNs and Transformers—and supports low-bit quantized models (e.g., W6A6). On ImageNet-C, ZOA improves the average accuracy of a quantized ViT-B by 5.0% over the state-of-the-art FOA, markedly enhancing robustness and generalization. By enabling efficient, gradient-free online adaptation, ZOA establishes a practical new paradigm for resource-constrained deployment scenarios.
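The summary's core mechanism, updating a model from only two forward passes without backpropagation, matches the classical zeroth-order (SPSA-style) gradient estimate: perturb the parameters in a random direction, evaluate the loss twice, and step against the estimated directional derivative. The sketch below is a minimal illustration of that idea under assumed names and hyperparameters (`zo_estimate_update`, `mu`, `lr` are illustrative), not the paper's exact algorithm.

```python
import numpy as np

def zo_estimate_update(params, loss_fn, mu=1e-3, lr=1e-2, rng=None):
    """One zeroth-order (SPSA-style) update: two forward passes, no gradients.

    Hypothetical sketch of the two-forward-pass idea; the paper's actual
    update rule and perturbation scheme may differ.
    """
    rng = rng or np.random.default_rng(0)
    u = rng.standard_normal(params.shape)            # random perturbation direction
    loss_plus = loss_fn(params + mu * u)             # forward pass 1
    loss_minus = loss_fn(params - mu * u)            # forward pass 2
    g_scalar = (loss_plus - loss_minus) / (2 * mu)   # directional-derivative estimate
    return params - lr * g_scalar * u                # gradient-free parameter step
```

Because the update needs only loss values, it sidesteps the vanishing-gradient and memory issues that make backpropagation impractical on low-bit quantized models.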

📝 Abstract
Quantizing deep models prior to deployment is a widely adopted technique to speed up inference for various real-time applications, such as autonomous driving. However, quantized models often suffer from severe performance degradation in dynamic environments with potential domain shifts, and this degradation is significantly more pronounced than in their full-precision counterparts, as shown by our theoretical and empirical illustrations. To address the domain shift problem, test-time adaptation (TTA) has emerged as an effective solution by enabling models to learn adaptively from test data. Unfortunately, existing TTA methods are often impractical for quantized models, as they typically rely on gradient backpropagation--an operation that is unsupported on quantized models due to vanishing gradients as well as memory and latency constraints. In this paper, we focus on TTA for quantized models to improve their robustness and generalization ability efficiently. We propose a continual zeroth-order adaptation (ZOA) framework that enables efficient model adaptation using only two forward passes, eliminating the computational burden of existing methods. Moreover, we propose a domain knowledge management scheme that stores and reuses knowledge from different domains with negligible memory consumption, reducing interference between domains and fostering knowledge accumulation during long-term adaptation. Experimental results on three classical architectures, including quantized Transformer-based and CNN-based models, demonstrate the superiority of our method for quantized model adaptation. On the quantized W6A6 ViT-B model, ZOA achieves a 5.0% improvement over the state-of-the-art FOA on the ImageNet-C dataset. The source code is available at https://github.com/DengZeshuai/ZOA.
Problem

Research questions and friction points this paper is trying to address.

Address performance degradation in quantized neural networks under domain shifts
Enable test-time adaptation for quantized models without gradient backpropagation
Improve robustness and generalization of quantized models efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual zeroth-order adaptation for quantized models
Domain knowledge management with minimal memory
Two forward passes enable efficient adaptation
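The "domain knowledge management with minimal memory" bullet can be pictured as a small per-domain store: one lightweight signature and one accumulated parameter offset per seen domain, so updates for one domain do not overwrite another's. The class below is an assumed illustration; the names (`DomainKnowledgeStore`, `sim_threshold`) and the cosine-similarity matching rule are my own, not taken from the paper.

```python
import numpy as np

class DomainKnowledgeStore:
    """Hypothetical sketch of lightweight per-domain knowledge management:
    store one signature and one parameter offset per domain, and reuse the
    offset when a similar domain recurs (suppressing cross-domain interference)."""

    def __init__(self, sim_threshold=0.9):
        self.keys = []       # per-domain feature signatures
        self.offsets = []    # per-domain accumulated parameter offsets
        self.sim_threshold = sim_threshold

    def match(self, signature):
        # return the index of a stored domain whose signature is similar enough
        for i, k in enumerate(self.keys):
            sim = k @ signature / (np.linalg.norm(k) * np.linalg.norm(signature) + 1e-12)
            if sim >= self.sim_threshold:
                return i
        return None

    def get_or_create(self, signature, dim):
        # look up the current domain; allocate a fresh zero offset if unseen
        idx = self.match(signature)
        if idx is None:
            self.keys.append(np.asarray(signature, dtype=float).copy())
            self.offsets.append(np.zeros(dim))
            idx = len(self.keys) - 1
        return idx

    def accumulate(self, idx, delta):
        # fold a new test-time update into this domain's offset only
        self.offsets[idx] += delta
```

Since each domain costs only one signature vector and one offset vector, the storage overhead stays negligible even during long-term adaptation across many domains.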
Zeshuai Deng
South China University of Technology
Computer Vision
Guohao Chen
South China University of Technology
Transfer Learning, Domain Adaptation, Test-Time Adaptation
Shuaicheng Niu
Nanyang Technological University
Machine Learning, Domain Adaptation, Robustness, AutoML
Hui Luo
Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu, China
Shuhai Zhang
South China University of Technology
Computer Vision, Machine Learning
Yifan Yang
South China University of Technology, Guangzhou, China
Renjie Chen
South China University of Technology, Guangzhou, China
Wei Luo
South China Agricultural University, Guangzhou, China
Mingkui Tan
South China University of Technology
Machine Learning, Large-scale Optimization