Statistical Learning Theory for Distributional Classification

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses supervised classification under a two-stage sampling setting where inputs are unobservable probability distributions accessible only through samples. The authors propose a Gaussian kernel support vector machine method based on kernel mean embeddings, which maps input distributions into a reproducing kernel Hilbert space for classification. By introducing a novel oracle inequality and a tailored noise condition, they establish consistency guarantees and derive explicit learning rates within this framework. A new feature space specifically adapted to Gaussian kernels on Hilbert spaces is also constructed. The theoretical analysis shows that, under the hinge loss and the stated noise condition, the proposed method achieves the derived learning rates, supporting its effectiveness for distribution-input classification tasks.

📝 Abstract
In supervised learning with distributional inputs in the two-stage sampling setup, relevant to applications like learning-based medical screening or causal learning, the inputs (which are probability distributions) are not accessible in the learning phase, but only samples thereof. This problem is particularly amenable to kernel-based learning methods, where the distributions or samples are first embedded into a Hilbert space, often using kernel mean embeddings (KMEs), and then a standard kernel method like Support Vector Machines (SVMs) is applied, using a kernel defined on the embedding Hilbert space. In this work, we contribute to the theoretical analysis of this latter approach, with a particular focus on classification with distributional inputs using SVMs. We establish a new oracle inequality and derive consistency and learning rate results. Furthermore, for SVMs using the hinge loss and Gaussian kernels, we formulate a novel variant of an established noise assumption from the binary classification literature, under which we can establish learning rates. Finally, some of our technical tools like a new feature space for Gaussian kernels on Hilbert spaces are of independent interest.
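The two-stage construction described in the abstract — embed each bag of samples via its empirical kernel mean embedding, then apply a Gaussian kernel on the embedding Hilbert space — can be sketched in a few lines of NumPy. This is a minimal illustration of the general KME-plus-Gaussian-kernel idea, not the authors' exact estimator; the function names and the bandwidth parameters `gamma` and `sigma` are ours.

```python
import numpy as np

def gauss(x, y, gamma=1.0):
    # Base Gaussian kernel matrix on the input space R^d:
    # k(x_i, y_j) = exp(-gamma * ||x_i - y_j||^2)
    return np.exp(-gamma * np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1))

def mmd2(X, Y, gamma=1.0):
    # Squared RKHS distance ||mu_X - mu_Y||^2 between the empirical
    # kernel mean embeddings of two sample sets (biased V-statistic form).
    return (gauss(X, X, gamma).mean()
            + gauss(Y, Y, gamma).mean()
            - 2.0 * gauss(X, Y, gamma).mean())

def kme_gauss_kernel(X, Y, gamma=1.0, sigma=1.0):
    # Second-stage Gaussian kernel on the embedding Hilbert space:
    # K(P, Q) = exp(-||mu_P - mu_Q||^2 / (2 * sigma^2)),
    # evaluated on the empirical embeddings of the sample bags X and Y.
    return np.exp(-mmd2(X, Y, gamma) / (2.0 * sigma ** 2))
```

In a classification pipeline, one would assemble the Gram matrix `K[i, j] = kme_gauss_kernel(bags[i], bags[j])` over all training bags and pass it to any standard kernel SVM solver that accepts a precomputed kernel.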
Problem

Research questions and friction points this paper is trying to address.

distributional classification
two-stage sampling
kernel mean embeddings
statistical learning theory
support vector machines
Innovation

Methods, ideas, or system contributions that make the work stand out.

distributional classification
kernel mean embedding
support vector machines
learning rates
Gaussian kernels on Hilbert spaces