๐ค AI Summary
This work uncovers an intrinsic unification between distance-based $k$-medoids clustering and probability-density-matching vector quantization (VQ) via kernel density estimation (KDE), both formulated within the quadratic unconstrained binary optimization (QUBO) framework. Methodologically, we cast both paradigms as QUBO problems and rigorously proveโ for the first timeโthat the KDE-QUBO formulation is a special case of the $k$-medoids-QUBO problem under a kernel-induced feature mapping. This equivalence is characterized jointly by the maximum mean discrepancy (MMD) and kernel metrics, thereby establishing a structural bridge between distance-driven and distribution-matching VQ. Furthermore, we provide a geometric interpretation of the weighting parameters in VQ, substantially enhancing model interpretability. Collectively, these results yield a unified theoretical foundation for designing efficient, provably optimal hard quantization algorithms.
๐ Abstract
Vector Quantization (VQ) is a widely used technique in machine learning and data compression, valued for its simplicity and interpretability. Among hard VQ methods, $k$-medoids clustering and Kernel Density Estimation (KDE) approaches represent two prominent yet seemingly unrelated paradigms -- one distance-based, the other rooted in probability density matching. In this paper, we investigate their connection through the lens of Quadratic Unconstrained Binary Optimization (QUBO). We compare a heuristic QUBO formulation for $k$-medoids, which balances centrality and diversity, with a principled QUBO derived from minimizing Maximum Mean Discrepancy in KDE-based VQ. Surprisingly, we show that the KDE-QUBO is a special case of the $k$-medoids-QUBO under mild assumptions on the kernel's feature map. This reveals a deeper structural relationship between these two approaches and provides new insight into the geometric interpretation of the weighting parameters used in QUBO formulations for VQ.