Querying Kernel Methods Suffices for Reconstructing their Training Data

📅 2025-05-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper reveals a critical training data reconstruction vulnerability in overparameterized kernel methods—including kernel regression, SVMs, and kernel density estimation—under black-box settings where only model outputs (not internal parameters) are accessible. Method: We propose a gradient-free inversion framework that requires no parameter access, relying solely on query responses; it leverages theoretical analysis of kernel matrix structure and optimized inversion strategies to reconstruct inputs. Contribution/Results: We provide the first theoretical and empirical demonstration that *any* positive-definite kernel enables high-fidelity reconstruction of the entire training dataset. Extensive experiments across standard benchmarks confirm effectiveness, achieving average reconstruction error below 1.2%. Our approach breaks the conventional assumption that privacy attacks on kernel methods require white-box parameter access, establishing a novel paradigm for privacy assessment of kernel models in black-box scenarios.

📝 Abstract
Over-parameterized models have raised concerns about their potential to memorize training data, even when achieving strong generalization. The privacy implications of such memorization are generally unclear, particularly in scenarios where only model outputs are accessible. We study this question in the context of kernel methods, and demonstrate both empirically and theoretically that querying kernel models at various points suffices to reconstruct their training data, even without access to model parameters. Our results hold for a range of kernel methods, including kernel regression, support vector machines, and kernel density estimation. Our hope is that this work can illuminate potential privacy concerns for such models.
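The abstract's central claim — that black-box queries alone suffice to recover training points — can be illustrated with a toy sketch for the kernel density estimation case (an illustrative example, not the paper's reconstruction algorithm): for a Gaussian KDE with bandwidth small relative to the point spacing, each training point induces a local maximum of the estimated density, so a fine grid of queries reveals the points. The variable names (`kde_query`, `recon`) are made up for this sketch.

```python
import numpy as np

# Toy black-box reconstruction for a Gaussian kernel density estimator.
# "train" plays the role of the hidden training set; the attacker sees
# only kde_query(x), never the points or parameters themselves.
train = np.array([-0.8, -0.3, 0.1, 0.45, 0.9])  # hidden 1-D training data
h = 0.05  # kernel bandwidth, small relative to the point spacing

def kde_query(x):
    """Black-box access: returns density estimates only."""
    x = np.atleast_1d(x)
    z = np.exp(-((x[None, :] - train[:, None]) ** 2) / (2 * h ** 2))
    return z.sum(axis=0) / (len(train) * h * np.sqrt(2 * np.pi))

# Attack: query a fine grid and keep strict local maxima of the responses.
# Each well-separated training point creates one mode of the density.
grid = np.linspace(-1.2, 1.2, 4001)
vals = kde_query(grid)
is_peak = (vals[1:-1] > vals[:-2]) & (vals[1:-1] > vals[2:])
recon = np.sort(grid[1:-1][is_peak])

err = np.max(np.abs(recon - np.sort(train)))
print("reconstructed:", recon)
print(f"max error: {err:.4f}")
```

With well-separated points this recovers every training point to within the grid resolution; the paper's actual contribution is much stronger, covering kernel regression and SVMs and arbitrary positive-definite kernels.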
Problem

Research questions and friction points this paper is trying to address.

Reconstructing training data from kernel models via queries
Investigating privacy risks of over-parameterized kernel methods
Demonstrating data leakage without accessing model parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Querying kernel models reconstructs training data
Works without accessing model parameters
Applies to various kernel methods
Daniel Barzilai
Weizmann Institute of Science
Machine Learning, Deep Learning Theory
Yuval Margalit
Weizmann Institute of Science
Eitan Gronich
Weizmann Institute of Science
Gilad Yehudai
Postdoctoral Associate, New York University
Machine Learning, Neural Networks
M. Galun
Weizmann Institute of Science
R. Basri
Weizmann Institute of Science