Scalable and consistent embedding of probability measures into Hilbert spaces via measure quantization

📅 2025-02-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Efficiently embedding probability measures into Hilbert spaces remains computationally prohibitive at scale due to the high cost of linearized optimal transport (LOT) and kernel mean embeddings (KME). Method: We propose a low-support discrete approximation framework based on measure quantization, bypassing LOT and KME while preserving geometric structure. Contribution/Results: We establish, for the first time, the statistical consistency of measure quantization approximations, theoretically guaranteeing scalability and structural fidelity in Hilbert space embedding. The method unifies optimal transport, measure quantization, and kernel methods, balancing theoretical rigor with practical deployability. Experiments demonstrate controllable embedding error, 10–100× speedup in computation, and no degradation in downstream learning performance.

Technology Category

Application Category

📝 Abstract
This paper is focused on statistical learning from data that come as probability measures. In this setting, popular approaches consist in embedding such data into a Hilbert space with either Linearized Optimal Transport or Kernel Mean Embedding. However, the cost of computing such embeddings prohibits their direct use in large-scale settings. We study two methods based on measure quantization for approximating input probability measures with discrete measures of small-support size. The first one is based on optimal quantization of each input measure, while the second one relies on mean-measure quantization. We study the consistency of such approximations, and its implication for scalable embeddings of probability measures into a Hilbert space at a low computational cost. We finally illustrate our findings with various numerical experiments.
Problem

Research questions and friction points this paper is trying to address.

Embed probability measures into Hilbert spaces.
Reduce computational cost for large-scale data.
Ensure consistency in measure quantization methods.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Measure quantization techniques
Low-cost Hilbert space embeddings
Scalable statistical learning methods
🔎 Similar Papers
No similar papers found.