Quantum-Inspired Optimization Process for Data Imputation

📅 2025-05-07

🤖 AI Summary

This work addresses the imputation of biologically implausible missing values in clinical data, such as the zero-value artifacts in the UCI Diabetes dataset. We propose a gradient-free optimization framework that combines Principal Component Analysis (PCA) with quantum-inspired state rotations. The rotations are constrained so that reconstructed values stay within ±2 standard deviations of each feature's original distribution, which avoids the clustering around the mean or median that simpler imputers produce. Three gradient-free optimizers (COBYLA, simulated annealing, and differential evolution) minimize the distributional divergence between imputed and observed values, measured by the Wasserstein distance and the Kolmogorov–Smirnov (KS) test. In the reported experiments, the average Wasserstein distance falls by more than 85%, and KS p-values lie between 0.18 and 0.22, compared with p-values above 0.99 for conventional methods (mean, KNN, and MICE); this is reported as an improvement in distributional fidelity. The approach also removes zero-value artifacts and yields imputed data with more realistic variability.
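The two divergence measures named above can be computed with SciPy. The following is a minimal sketch, not the paper's evaluation code: the helper name `distribution_gap` and the synthetic "glucose-like" data are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import ks_2samp, wasserstein_distance


def distribution_gap(observed, imputed):
    """Compare imputed values against observed ones (hypothetical helper)."""
    ks = ks_2samp(observed, imputed)
    return {
        "wasserstein": wasserstein_distance(observed, imputed),
        "ks_stat": ks.statistic,
        "ks_pvalue": ks.pvalue,
    }


# Synthetic stand-in for one clinical feature (assumption, not real data).
rng = np.random.default_rng(0)
observed = rng.normal(loc=120.0, scale=30.0, size=500)

# Two candidate fills for 100 missing entries.
mean_fill = np.full(100, observed.mean())           # collapses onto the mean
resampled_fill = rng.choice(observed, size=100)     # keeps the spread

gap_mean = distribution_gap(observed, mean_fill)
gap_resampled = distribution_gap(observed, resampled_fill)
print(gap_mean)
print(gap_resampled)
```

In this toy setting, the constant mean fill sits far from the observed distribution in Wasserstein terms, while a fill that preserves the spread sits much closer. That gap is the quantity the optimizers are described as minimizing.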

📝 Abstract

Data imputation is a critical step in data pre-processing, particularly for datasets with missing or unreliable values. This study introduces a novel quantum-inspired imputation framework evaluated on the UCI Diabetes dataset, which contains biologically implausible missing values across several clinical features. The method integrates Principal Component Analysis (PCA) with quantum-assisted rotations, optimized through gradient-free classical optimizers (COBYLA, Simulated Annealing, and Differential Evolution), to reconstruct missing values while preserving statistical fidelity. Reconstructed values are constrained within ±2 standard deviations of the original feature distributions, avoiding unrealistic clustering around central tendencies. This approach achieves a substantial and statistically significant improvement, including an average reduction of over 85% in Wasserstein distance and Kolmogorov-Smirnov test p-values between 0.18 and 0.22, compared to p-values > 0.99 for classical methods such as Mean, KNN, and MICE. The method also eliminates zero-value artifacts and enhances the realism and variability of imputed data. By combining quantum-inspired transformations with a scalable classical framework, this methodology provides a robust solution for imputation tasks in domains such as healthcare and AI pipelines, where data quality and integrity are crucial.
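The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea under loud assumptions: each missing value is parameterized by a rotation angle, the cosine of that angle maps it into the ±2 standard deviation band, and SciPy's `differential_evolution` (one of the gradient-free optimizers named above) tunes the angles to minimize Wasserstein distance. The PCA step is omitted, and the function name `impute_by_rotation`, the cosine encoding, and the synthetic data are all hypothetical.

```python
import numpy as np
from scipy.optimize import differential_evolution
from scipy.stats import wasserstein_distance


def impute_by_rotation(observed, n_missing, seed=0):
    """Toy angle-parameterized imputer (illustrative, not the paper's method)."""
    mu, sigma = observed.mean(), observed.std()

    def decode(angles):
        # cos(angle) lies in [-1, 1], so every value stays in mu ± 2*sigma.
        return mu + 2.0 * sigma * np.cos(angles)

    def loss(angles):
        # Distributional divergence between observed and candidate values.
        return wasserstein_distance(observed, decode(angles))

    result = differential_evolution(
        loss,
        bounds=[(0.0, np.pi)] * n_missing,  # one angle per missing entry
        seed=seed,
        maxiter=100,
    )
    return decode(result.x)


# Synthetic stand-in for one feature with 6 missing entries (assumption).
rng = np.random.default_rng(1)
observed = rng.normal(loc=120.0, scale=30.0, size=200)
imputed = impute_by_rotation(observed, n_missing=6)

mean_fill = np.full(6, observed.mean())
print("optimized fill:", wasserstein_distance(observed, imputed))
print("mean fill:     ", wasserstein_distance(observed, mean_fill))
```

Encoding the bound in the parameterization, rather than clipping afterwards, keeps every candidate feasible, so the optimizer never wastes evaluations on out-of-range values. Swapping in COBYLA would mean calling `scipy.optimize.minimize(loss, x0, method="COBYLA")` with a starting vector of angles.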

Problem

Research questions and friction points this paper is trying to address.

Develops quantum-inspired imputation for missing data in clinical datasets

Optimizes PCA with quantum rotations to preserve statistical fidelity

Reduces Wasserstein distance and avoids unrealistic imputed value clustering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantum-inspired PCA with gradient-free optimizers

Constrained imputation within ±2 standard deviations

Combines quantum rotations with classical framework
