Influence of Data Dimensionality Reduction Methods on the Effectiveness of Quantum Machine Learning Models

📅 2025-11-05

📈 Citations: 0

✨ Influential: 0

career value

190K/year

🤖 AI Summary

This study identifies a systematic bias introduced by dimensionality reduction in quantum machine learning (QML) evaluation. To mitigate resource constraints on NISQ devices and classical simulation bottlenecks, existing works routinely apply dimensionality reduction as a preprocessing step—yet its confounding effect on performance assessment remains underrecognized. Through large-scale comparative experiments—spanning synthetic and real-world datasets, classical methods (e.g., PCA), quantum encoding schemes (amplitude and angle encoding), diverse ansatz architectures, and mainstream quantum classifiers—we observe accuracy and F1-score fluctuations of 14–48% attributable to reduction. Critically, the bias magnitude arises from the coupled influence of data characteristics, encoding strategy, and circuit structure. We establish, for the first time, that dimensionality reduction is not a neutral preprocessing step but a critical confounder inducing erroneous model efficacy judgments; moreover, implicit compatibility among specific reduction–encoding–circuit combinations further distorts performance attribution. Our findings provide methodological warnings and practical benchmarks for fair QML evaluation.

Technology Category

Application Category

📝 Abstract

Data dimensionality reduction techniques are often utilized in the implementation of Quantum Machine Learning models to address two significant issues: the constraints of NISQ quantum devices, which are characterized by noise and a limited number of qubits, and the challenge of simulating a large number of qubits on classical devices. It also raises concerns over the scalability of these approaches, as dimensionality reduction methods are slow to adapt to large datasets. In this article, we analyze how data reduction methods affect different QML models. We conduct this experiment over several generated datasets, quantum machine algorithms, quantum data encoding methods, and data reduction methods. All these models were evaluated on the performance metrics like accuracy, precision, recall, and F1 score. Our findings have led us to conclude that the usage of data dimensionality reduction methods results in skewed performance metric values, which results in wrongly estimating the actual performance of quantum machine learning models. There are several factors, along with data dimensionality reduction methods, that worsen this problem, such as characteristics of the datasets, classical to quantum information embedding methods, percentage of feature reduction, classical components associated with quantum models, and structure of quantum machine learning models. We consistently observed the difference in the accuracy range of 14% to 48% amongst these models, using data reduction and not using it. Apart from this, our observations have shown that some data reduction methods tend to perform better for some specific data embedding methodologies and ansatz constructions.

Problem

Research questions and friction points this paper is trying to address.

Evaluating how data dimensionality reduction affects quantum machine learning model performance metrics

Addressing scalability issues of quantum models on noisy limited-qubit devices through feature reduction

Analyzing interactions between reduction methods, data embedding, and quantum model structures

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluating data dimensionality reduction in quantum machine learning

Analyzing performance impact across multiple quantum algorithms

Identifying skewed metrics from feature reduction methods

🔎 Similar Papers

A Survey on Quantum Machine Learning: Current Trends, Challenges, Opportunities, and the Road Ahead

2023-10-16arXiv.orgCitations: 19

Bosch Group

Renningen, BW, DE

Authors to Follow