🤖 AI Summary
Quantum Support Vector Machines (QSVMs) suffer from poor scalability due to limitations in high-dimensional quantum state representation and current hardware constraints. To address this, we propose an embedding-aware quantum-classical SVM framework: robust features are extracted via a pretrained Vision Transformer (ViT), followed by class-balanced k-means distillation for data compression, and classical front-end optimization to enhance embedding quality. Theoretically and empirically, we reveal an intrinsic synergy between ViT’s attention mechanism and quantum feature maps—demonstrating, for the first time, that embedding selection is the critical factor enabling quantum kernel advantages. Leveraging cuTensorNet, we simulate 16-qubit tensor networks, achieving accuracy improvements of 8.02% on Fashion-MNIST and 4.42% on MNIST over baseline QSVMs. This work establishes a novel, scalable paradigm for quantum machine learning, supported by rigorous theoretical analysis and empirical validation.
📝 Abstract
Quantum Support Vector Machines face scalability challenges due to high-dimensional quantum states and hardware limitations. We propose an embedding-aware quantum-classical pipeline combining class-balanced k-means distillation with pretrained Vision Transformer embeddings. Our key finding: ViT embeddings uniquely enable quantum advantage, achieving up to 8.02% accuracy improvements over classical SVMs on Fashion-MNIST and 4.42% on MNIST, while CNN features show performance degradation. Using 16-qubit tensor network simulation via cuTensorNet, we provide the first systematic evidence that quantum kernel advantage depends critically on embedding choice, revealing fundamental synergy between transformer attention and quantum feature spaces. This provides a practical pathway for scalable quantum machine learning that leverages modern neural architectures.