🤖 AI Summary
Conventional quantization research assumes that quantization error directly predicts classification performance—a hypothesis lacking theoretical justification and contradicted by empirical evidence (e.g., binary/tri-level quantization often incurs large reconstruction error yet preserves or even improves accuracy). Method: This work departs from the error-minimization paradigm and instead directly evaluates the discriminative power of quantized features. Contribution/Results: We theoretically and empirically establish that {0,1} binary and {0,±1} ternary quantization can enhance—rather than degrade—feature separability, challenging the prevailing “lower error implies higher accuracy” assumption. Extensive experiments across multimodal data (images, speech, text) and multiple benchmark datasets demonstrate that our low-bit quantization methods achieve classification accuracy on par with or superior to full-precision models. These findings reveal that the fundamental impact of quantization on classification lies in discriminability reconfiguration—not reconstruction fidelity optimization.
📝 Abstract
In machine learning, quantization is widely used to simplify data representation and facilitate algorithm deployment on hardware. Given the fundamental role of classification in machine learning, it is crucial to investigate the impact of quantization on classification. Current research primarily focuses on quantization errors, operating under the premise that higher quantization errors generally result in lower classification performance. However, this premise lacks a solid theoretical foundation and often contradicts empirical findings. For instance, certain extremely low bit-width quantization methods, such as ${0,1}$-binary quantization and ${0, pm1}$-ternary quantization, can achieve comparable or even superior classification accuracy compared to the original non-quantized data, despite exhibiting high quantization errors. To more accurately evaluate classification performance, we propose to directly investigate the feature discrimination of quantized data, instead of analyzing its quantization error. Interestingly, it is found that both binary and ternary quantization methods can improve, rather than degrade, the feature discrimination of the original data. This remarkable performance is validated through classification experiments across various data types, including images, speech, and texts.