Generalized Kernel Inducing Points by Duality Gap for Dataset Distillation

📅 2025-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Kernel Inducing Points (KIP) methods are constrained to the squared loss, which limits their applicability to classification tasks that typically use non-quadratic convex losses such as cross-entropy or hinge loss. To address this limitation, we propose Duality Gap Kernel Inducing Points (DGKIP), a dataset distillation framework built on the duality gap from convex optimization. The duality gap yields an upper bound on how much the model parameters change after distillation, which in turn gives bounds on the test error and on prediction consistency between the model trained on the distilled data and the model trained on the full data. This lifts the restriction to the squared loss and makes the method compatible with a broad class of convex losses. Experiments on MNIST and CIFAR-10 show that DGKIP retains KIP's computational efficiency while achieving stable accuracy and strong generalization across the supported convex losses, including cross-entropy.
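The summary's claim that the duality gap controls parameter change, and hence prediction consistency, follows a standard convex-duality argument. The sketch below uses my own notation rather than the paper's, and the exact constants in the kernelized setting may differ.

```latex
% Notation (assumed, not taken from the paper): P is the lambda-strongly convex
% training objective on the original data, D its Fenchel dual, theta^* the
% full-data optimum, theta_S the model trained on the distilled set S, and
% G(theta_S) the duality gap evaluated at theta_S.
\[
  G(\theta_S) \;=\; P(\theta_S) - D(\alpha_S)
  \;\ge\; P(\theta_S) - P(\theta^*)
  \;\ge\; \frac{\lambda}{2}\,\lVert \theta_S - \theta^* \rVert^2
  \quad\Longrightarrow\quad
  \lVert \theta_S - \theta^* \rVert \;\le\; \sqrt{\frac{2\,G(\theta_S)}{\lambda}}.
\]
% For a kernel/linear model f(x) = <phi(x), theta>, Cauchy-Schwarz then gives
% |f(x; theta_S) - f(x; theta^*)| <= ||phi(x)|| * sqrt(2 G(theta_S) / lambda),
% which is the sense in which a small duality gap implies prediction consistency.
```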

📝 Abstract
We propose Duality Gap KIP (DGKIP), an extension of the Kernel Inducing Points (KIP) method for dataset distillation. While existing dataset distillation methods often rely on bi-level optimization, DGKIP eliminates the need for such optimization by leveraging duality theory in convex programming. The KIP method has been introduced as a way to avoid bi-level optimization; however, it is limited to the squared loss and does not support other loss functions (e.g., cross-entropy or hinge loss) that are more suitable for classification tasks. DGKIP addresses this limitation by exploiting an upper bound on parameter changes after dataset distillation using the duality gap, enabling its application to a wider range of loss functions. We also characterize theoretical properties of DGKIP by providing upper bounds on the test error and prediction consistency after dataset distillation. Experimental results on standard benchmarks such as MNIST and CIFAR-10 demonstrate that DGKIP retains the efficiency of KIP while offering broader applicability and robust performance.
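The abstract's core idea is that, for a strongly convex training objective, the duality gap of any candidate model upper-bounds its distance to the full-data optimum. Below is a minimal numpy sketch of that bound for L2-regularized logistic regression on a plain linear model. It is not the authors' implementation: DGKIP works with kernel features and optimizes the synthetic points themselves, which this sketch omits, and the random subset standing in for the distilled set, the function names, and the hyperparameters are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def primal(w, X, y, lam):
    """Full-data objective P(w) = mean log(1 + exp(-y * Xw)) + (lam/2)||w||^2."""
    margins = y * (X @ w)
    return np.mean(np.logaddexp(0.0, -margins)) + 0.5 * lam * (w @ w)

def dual_from_primal(w, X, y, lam):
    """Fenchel dual value at the dual candidate alpha_i = y_i * sigmoid(-y_i x_i^T w)."""
    n = len(y)
    p = np.clip(sigmoid(-y * (X @ w)), 1e-12, 1 - 1e-12)
    alpha = y * p
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    v = X.T @ alpha / (lam * n)          # primal point implied by alpha
    return np.mean(entropy) - 0.5 * lam * (v @ v)

def fit_gd(X, y, lam, steps=5000, lr=0.5):
    """Plain gradient descent on the regularized logistic objective."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        grad = -(X.T @ (y * sigmoid(-y * (X @ w)))) / n + lam * w
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
lam = 0.1
X = rng.normal(size=(500, 5))
y = np.sign(X @ rng.normal(size=5) + 0.3 * rng.normal(size=500))

# Stand-in "distilled" set: a random subset (DGKIP would instead learn synthetic points).
idx = rng.choice(500, size=20, replace=False)
w_S = fit_gd(X[idx], y[idx], lam)     # model trained on the small set
w_full = fit_gd(X, y, lam)            # reference model trained on all data

# Duality gap of w_S w.r.t. the FULL-data objective; lam-strong convexity gives
# ||w_S - w*|| <= sqrt(2 * gap / lam).
gap = primal(w_S, X, y, lam) - dual_from_primal(w_S, X, y, lam)
print(f"duality gap          : {gap:.4f}")
print(f"bound sqrt(2*gap/lam): {np.sqrt(2.0 * gap / lam):.4f}")
print(f"actual ||w_S - w*||  : {np.linalg.norm(w_S - w_full):.4f}")
```

In a DGKIP-style loop, the gap computed at the end would be the quantity to control with respect to the distilled inputs, since driving it toward zero forces the distilled-data model toward the full-data one.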
Problem

Research questions and friction points this paper is trying to address.

KIP avoids bi-level optimization but is restricted to the squared loss
Classification tasks favor non-quadratic convex losses such as cross-entropy or hinge loss
Guarantees are needed on how much the model changes when trained on distilled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses the convex duality gap to bound parameter changes after distillation
Supports a wide range of convex losses, including cross-entropy and hinge loss
Provides upper bounds on test error and prediction consistency while retaining KIP's efficiency
Tatsuya Aoyama
Research Scientist, Meta
language modeling, pretraining dynamics, interpretability, cognitive science
Hanting Yang
Nagoya University, Nagoya, Japan
Hiroyuki Hanada
Nagoya University
machine learning, mathematical optimization
Satoshi Akahane
Nagoya University, Nagoya, Japan
Tomonari Tanaka
Nagoya University, Nagoya, Japan
Yoshito Okura
Nagoya University, Nagoya, Japan
Yu Inatsu
Nagoya Institute of Technology, Nagoya, Japan
Noriaki Hashimoto
RIKEN, Nagoya, Japan
Taro Murayama
DENSO CORPORATION, Kariya, Japan
Hanju Lee
DENSO CORPORATION, Kariya, Japan
Shinya Kojima
DENSO CORPORATION, Kariya, Japan
Ichiro Takeuchi
Nagoya University, Nagoya, Japan; RIKEN, Nagoya, Japan