Compander-Aligned Query Geometry for Quantized Zeroth-Order Optimization

📅 2026-05-11

🤖 AI Summary
This work addresses the failure of conventional zeroth-order (ZO) optimization under low-bit forward evaluation, where quantization distorts the geometry of the queries and degrades performance. The authors propose CAQ-ZO, which identifies query geometry as the missing object and models scalar nonuniform quantization through a compander φ, as Q = φ⁻¹ ∘ U ∘ φ. CAQ-ZO builds one-grid-step Rademacher stencils in the companded domain z = φ(x), maps the endpoints back through φ⁻¹ for loss evaluation, and applies the update in z. Because the query endpoints lie on the quantization grid, the query-time rounding residual is exactly zero in the authors' analysis. Synthetic experiments isolate the residual channel that generic off-grid queries retain, and matched NF4 fine-tuning of Qwen and Llama models shows that CAQ-ZO improves on the trained NF4 baseline under the same quantizer and evaluation budget.
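The alignment claim can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration, not the paper's setup: the compander is a signed square root (NF4 itself is codebook-based), and `DELTA`, `phi`, `phi_inv`, `quantize`, and the weight-space radius `mu` are hypothetical names and values. The sketch compares the rounding residual of endpoints built in the companded domain with that of a generic fixed-radius weight-space query.

```python
import numpy as np

DELTA = 0.125  # assumed uniform grid step in the companded domain z

def phi(x):
    """Illustrative compander (signed square root); an assumption, not NF4."""
    return np.sign(x) * np.sqrt(np.abs(x))

def phi_inv(z):
    """Inverse of the illustrative compander."""
    return np.sign(z) * z**2

def quantize(x):
    """Q = phi^{-1} o U o phi, where U rounds to a uniform grid of step DELTA."""
    return phi_inv(DELTA * np.round(phi(x) / DELTA))

rng = np.random.default_rng(0)
z = DELTA * rng.integers(-8, 9, size=6)   # on-grid point in z
r = rng.choice([-1.0, 1.0], size=z.shape)  # Rademacher direction

# Compander-aligned endpoints: one grid step in z, mapped back through phi_inv.
x_plus = phi_inv(z + DELTA * r)
x_minus = phi_inv(z - DELTA * r)
aligned_residual = max(
    np.max(np.abs(quantize(x_plus) - x_plus)),
    np.max(np.abs(quantize(x_minus) - x_minus)),
)

# Generic query: fixed radius mu in weight space, rounded afterwards.
mu = 0.01  # hypothetical weight-space radius
generic_plus = phi_inv(z) + mu * r
generic_residual = np.max(np.abs(quantize(generic_plus) - generic_plus))

print("aligned endpoint rounding residual:", aligned_residual)
print("generic endpoint rounding residual:", generic_residual)
```

In this toy setting the aligned endpoints already sit on the quantizer's grid, so rounding leaves them unchanged, while the fixed-radius endpoints are moved by the quantizer before the loss is measured.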
📝 Abstract
Low-bit forward evaluation is an attractive route to memory-efficient zeroth-order (ZO) adaptation: the optimizer needs only scalar losses, and the model can be queried near deployment precision. The obstacle is that a quantized ZO query is not a continuous finite difference followed by harmless storage rounding. The query chooses endpoints, the low-precision engine rounds them, and the loss difference is measured along the rounded chord. For nonuniform companding quantizers, this makes the codebook insufficient to predict ZO behavior: a fixed weight-space radius can collapse in dense cells, over-span sparse cells, or assign a rounded chord to an unrounded update direction. We identify the missing object as query geometry and model scalar nonuniform quantization as $Q = \phi^{-1} \circ U \circ \phi$. CAQ-ZO (Compander-Aligned Queries for Zeroth-Order Optimization) forms one-grid-step Rademacher stencils $z \pm \Delta r$ in $z = \phi(x)$, maps endpoints back through $\phi^{-1}$, and updates in $z$. Our theory proves the grid-span mismatch, decomposes endpoint-rounding estimator residuals, and gives stationarity bounds in which generic off-grid queries retain a $\Delta^2/\mu^2$ residual channel while CAQ-ZO makes the query-time residual exactly zero. Synthetic experiments isolate this channel, and matched NF4 Qwen/Llama fine-tuning shows that CAQ-ZO improves the trained NF4 baseline under the same quantizer and evaluation budget.
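The query-and-update loop described in the abstract might look roughly like the following toy sketch. It is not the authors' implementation. Assumptions: a signed-square-root compander stands in for the real quantizer, the grid is unbounded, the loss is a quadratic toy objective, a continuous latent `z` is kept and the stencil is centered on its nearest grid point, and the update is a plain two-point ZO-SGD step. All names (`caq_zo_step`, `low_bit_loss`, `DELTA`, `lr`) are hypothetical.

```python
import numpy as np

DELTA = 0.125  # assumed grid step in the companded domain z

def phi(x):
    """Illustrative compander (signed square root); an assumption, not NF4."""
    return np.sign(x) * np.sqrt(np.abs(x))

def phi_inv(z):
    return np.sign(z) * z**2

def quantize(x):
    """Q = phi^{-1} o U o phi with a uniform rounding step DELTA in z."""
    return phi_inv(DELTA * np.round(phi(x) / DELTA))

def low_bit_loss(x, target):
    """Toy scalar loss seen through the low-precision engine (weights rounded first)."""
    return 0.5 * np.sum((quantize(x) - target) ** 2)

def caq_zo_step(z, target, lr, rng):
    """One sketched step: on-grid stencil in z, endpoints via phi_inv, update in z."""
    center = DELTA * np.round(z / DELTA)        # assumed: stencil centered on the grid
    r = rng.choice([-1.0, 1.0], size=z.shape)   # Rademacher direction
    loss_plus = low_bit_loss(phi_inv(center + DELTA * r), target)
    loss_minus = low_bit_loss(phi_inv(center - DELTA * r), target)
    grad_z = (loss_plus - loss_minus) / (2.0 * DELTA) * r  # two-point estimate in z
    return z - lr * grad_z

rng = np.random.default_rng(0)
target = phi_inv(np.array([0.5, -0.75, 0.25, 1.0, -0.5, 0.75, -0.25, -1.0]))
z = np.zeros_like(target)
initial_loss = low_bit_loss(phi_inv(z), target)
for _ in range(2000):
    z = caq_zo_step(z, target, lr=0.02, rng=rng)
final_loss = low_bit_loss(phi_inv(z), target)
print("initial loss:", initial_loss, "final loss:", final_loss)
```

Because both endpoints are exact grid points, the rounding inside `low_bit_loss` leaves them unchanged, so the loss difference is measured along the same chord the update direction assumes. How the latent variable is stored between steps is a design choice this sketch guesses at; the abstract states only that the update happens in $z$.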
Problem

Research questions and friction points this paper is trying to address.

zeroth-order optimization
quantization
compander
query geometry
low-bit evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

zeroth-order optimization
nonuniform quantization
compander-aligned query
query geometry
low-bit fine-tuning