Deep Support Vectors

📅 2024-03-26
🏛️ Neural Information Processing Systems
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep learning suffers from poor generalization under few-shot settings and lacks interpretability in decision-making, while SVMs offer theoretical interpretability and sample efficiency—properties not yet effectively integrated into deep models. This paper introduces DeepKKT, the first framework extending Karush–Kuhn–Tucker (KKT) conditions to deep neural networks, enabling the formal definition and identification of Deep Support Vectors (DSVs): a sparse set of critical training samples that dominantly shape the classification boundary. Leveraging DSVs, we propose a few-shot knowledge distillation mechanism and an implicit class-conditional generative model—where class labels serve as latent variables—thereby transforming discriminative models into high-fidelity, controllable generative models. Experiments on ImageNet, CIFAR-10/100, and architectures including ResNet and ConvNet demonstrate substantial improvements in few-shot generalization, enable post-hoc interpretable decision analysis via DSV attribution, and generate high-quality, class-specific images.

📝 Abstract
Deep learning has achieved tremendous success. However, unlike SVMs, which provide direct decision criteria and can be trained with a small dataset, it still has significant weaknesses: it requires massive datasets during training, and its decision criteria remain a black box. This paper addresses these issues by identifying support vectors in deep learning models. To this end, we propose the DeepKKT condition, an adaptation of the traditional Karush-Kuhn-Tucker (KKT) condition to deep learning models, and confirm that Deep Support Vectors (DSVs) generated using this condition exhibit properties similar to traditional support vectors. This allows us to apply our method to few-shot dataset distillation problems and to alleviate the black-box characteristics of deep learning models. Additionally, we demonstrate that the DeepKKT condition can transform conventional classification models into generative models with high fidelity, in particular into latent generative models that use class labels as latent variables. We validate the effectiveness of DSVs on common datasets (ImageNet, CIFAR-10, and CIFAR-100) and common architectures (ResNet and ConvNet), proving their practical applicability. (See Fig.~\ref{fig:generated}.)
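For intuition, the classical KKT complementary-slackness condition that DeepKKT generalizes says that (hard-margin) support vectors are exactly the training points whose functional margin equals 1. The sketch below illustrates that test on a toy linear classifier; the weights and data are illustrative assumptions, not the paper's method or models.

```python
# Hedged sketch: identifying support vectors via the classical KKT
# complementary-slackness condition, alpha_i * (y_i * f(x_i) - 1) = 0,
# which implies hard-margin support vectors satisfy y_i * f(x_i) = 1.
# The paper's DeepKKT condition extends this idea to deep networks;
# the linear model and toy data here are purely illustrative.

def functional_margin(w, b, x, y):
    """y_i * (w . x_i + b); equals 1 for hard-margin support vectors."""
    return y * (sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy 2-D data separated by the line x1 + x2 = 3, with w scaled so the
# closest points on each side sit exactly on the margin.
w, b = [1.0, 1.0], -3.0
data = [([1.0, 1.0], -1), ([0.0, 1.0], -1),
        ([2.0, 2.0], +1), ([3.0, 3.0], +1)]

support_vectors = [x for x, y in data
                   if abs(functional_margin(w, b, x, y) - 1.0) < 1e-9]
print(support_vectors)  # -> [[1.0, 1.0], [2.0, 2.0]]
```

The margin-boundary points `[1, 1]` and `[2, 2]` are flagged; the interior points, whose margins exceed 1, are not, which is the sparsity property DSVs aim to reproduce in deep models.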
Problem

Research questions and friction points this paper is trying to address.

Identify support vectors in deep learning models
Address black-box characteristics of deep learning
Enable few-shot learning and dataset distillation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts KKT condition for deep learning models
Identifies support vectors in deep networks
Transforms classifiers into generative models