Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation

📅 2024-05-22
🏛️ arXiv.org
📈 Citations: 0
Influential: 0

🤖 AI Summary
To address the severe loss of depth-prediction accuracy caused by INT8 quantization of deep neural networks (DNNs) on resource-constrained devices, this paper proposes a geometry-aware change to the model's output representation. Depth values are modeled as points parameterized along a 2D Hilbert curve, the model is modified to predict the two curve coordinates, and the redundancy of this 2D embedding is used to suppress quantization error. The method adds only lightweight forward and inverse mapping operations at the output stage, with the inverse mapping run as post-processing, and it is demonstrated on both a U-Net and a vision transformer (ViT). On the Depth-From-Stereo task, it reduces quantization error by about 5× for INT8 models on both CPU and DSP delegates, with an inference-time increase of less than 7%. To our knowledge, this is the first work to exploit Hilbert-curve-induced spatial redundancy for enhancing quantization robustness in monocular or stereo depth estimation.

📝 Abstract
Quantization is widely used to improve the memory, computation, and power efficiency of deep neural networks (DNNs). Various techniques, such as post-training quantization and quantization-aware training, have been proposed to improve quantization quality. We introduce a novel approach to DNN quantization that uses a redundant representation of the DNN's output. We represent the target quantity as a point on a 2D parametric curve. The DNN model is modified to predict 2D points, which are mapped back to the target quantity at a post-processing stage. We demonstrate that this mapping can reduce quantization error. For a low-order parametric Hilbert curve, the Depth-From-Stereo task, and two models (a U-Net architecture and a vision transformer), we achieved a reduction in quantization error of about 5 times for the INT8 model on both CPU and DSP delegates. This gain comes with a minimal increase in inference time (less than 7%). Our approach can be applied to other tasks, including segmentation, object detection, and keypoint prediction.
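
A rough way to see why a curve-based output can reduce quantization error is the worked bound below. It rests on simplifying assumptions that are not taken from the paper: a uniform 8-bit quantizer on a unit range, a curve of total length L inside the unit square built from axis-aligned segments, a target t in [0, 1] placed at arc length L·t, and noise small enough that a quantized point never decodes to a neighbouring branch of the curve.

```latex
\begin{align*}
\Delta &= \frac{1}{2^{8}-1} = \frac{1}{255}
  && \text{(quantization step on a unit range)} \\
|e_{\text{direct}}| &\le \frac{\Delta}{2}
  && \text{(target quantized directly)} \\
|e_{\text{curve}}| &\lesssim \frac{1}{L}\cdot\frac{\Delta}{2}
  && \text{(arc-length error rescaled to the target range)}
\end{align*}
```

Under these assumptions a longer curve gives a larger gain, but its branches also lie closer together, so a deviation larger than about half the branch spacing could decode to the wrong branch.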
Problem

Research questions and friction points this paper is trying to address.

Overcoming low-bit precision limitations for high dynamic range depth restoration
Enabling high-precision depth prediction on low-end devices using Hilbert curves
Reducing quantization errors while maintaining computational efficiency in depth prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses 2D Hilbert curves for depth representation
Trains DNN to predict Hilbert curve components
Reconstructs depth from the quantized components in post-processing
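
The three steps listed above can be sketched in NumPy as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the curve order (2), the helper names (`hilbert_vertices`, `encode`, `decode`, `quantize8`), the rounding quantizer standing in for an INT8 model output, and the nearest-segment decoder are all hypothetical choices. No network is trained here; the "prediction" is simply the ground-truth 2D target after simulated quantization.

```python
import numpy as np

def hilbert_vertices(order):
    """Vertices of a Hilbert curve of the given order, scaled to the unit square."""
    n = 1 << order                          # the curve visits an n x n grid
    verts = []
    for d in range(n * n):                  # standard index -> (x, y) conversion
        x = y = 0
        t, s = d, 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:                     # rotate/flip the current quadrant
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        verts.append((x, y))
    return np.array(verts, dtype=float) / (n - 1)

def encode(t, verts):
    """Map scalars t in [0, 1] to 2D points at the matching relative arc length."""
    t = np.asarray(t, dtype=float)
    u = np.clip(t, 0.0, 1.0) * (len(verts) - 1)
    i = np.minimum(u.astype(int), len(verts) - 2)    # segment index
    frac = (u - i)[..., None]                        # position inside the segment
    return verts[i] + frac * (verts[i + 1] - verts[i])

def decode(p, verts):
    """Map 2D points back to scalars by projecting onto the nearest segment."""
    p = np.asarray(p, dtype=float)
    flat = p.reshape(-1, 2)
    a, d = verts[:-1], verts[1:] - verts[:-1]        # segment starts and directions
    rel = flat[:, None, :] - a[None]
    w = np.clip((rel * d).sum(-1) / (d * d).sum(-1), 0.0, 1.0)
    dist = ((rel - w[..., None] * d) ** 2).sum(-1)   # squared distance to each segment
    k = dist.argmin(axis=1)
    t = (k + w[np.arange(len(flat)), k]) / (len(verts) - 1)
    return t.reshape(p.shape[:-1])

def quantize8(x):
    """Simulate 8-bit uniform quantization of values in [0, 1]."""
    return np.round(np.asarray(x) * 255.0) / 255.0

rng = np.random.default_rng(0)
depth = rng.random((32, 32))                # toy normalized depth map (assumption)
verts = hilbert_vertices(order=2)

target_2d = encode(depth, verts)            # two-channel target a model would predict
restored = decode(quantize8(target_2d), verts)       # post-processing reconstruction

err_direct = np.abs(quantize8(depth) - depth).mean()
err_curve = np.abs(restored - depth).mean()
print(f"direct: {err_direct:.2e}  curve: {err_curve:.2e}")
```

In this toy setup the error reduction should be roughly equal to the polyline's length measured in units of the square's side (5 for the order-2 curve as scaled here). That figure is a property of the sketch and is not a reproduction of the paper's measurements, which involve a trained and quantized network.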
Mykhailo Uss
Samsung R&D Institute Ukraine, Kyiv 01032, Ukraine
Ruslan Yermolenko
Samsung R&D Institute Ukraine, Kyiv 01032, Ukraine
Olena Kolodiazhna
Samsung R&D Institute Ukraine, Kyiv 01032, Ukraine
Oleksii Shashko
Samsung R&D Institute Ukraine, Kyiv 01032, Ukraine
Ivan Safonov
Samsung R&D Institute Ukraine, Kyiv 01032, Ukraine
Volodymyr Savin
Samsung R&D Institute Ukraine, Kyiv 01032, Ukraine
Yoonjae Yeo
Samsung Research, Seoul 06765, Republic of Korea
Seowon Ji
Samsung Research, Seoul 06765, Republic of Korea
Jaeyun Jeong
Samsung Research, Seoul 06765, Republic of Korea