Investigating Privacy Leakage in Dimensionality Reduction Methods via Reconstruction Attack

📅 2024-08-30

🏛️ Journal of Information Security and Applications

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work examines a privacy leakage risk in dimensionality reduction: an adversary may be able to reconstruct the original high-dimensional data from its low-dimensional embeddings. The authors build a neural-network-based reconstruction attack under an informed adversary threat model and evaluate it against six widely used methods (PCA, SRP, MDS, Isomap, t-SNE, and UMAP) on the MNIST and NIH Chest X-ray datasets. The attack is effective against the deterministic methods, PCA and Isomap, but ineffective against the methods that involve random initialization (SRP, MDS, t-SNE, and UMAP). Adding large noise to the images before applying PCA or Isomap severely distorts the reconstructions; for the other four methods, reconstructions under noise retain some recognizable features but bear little resemblance to the originals.

📝 Abstract

This study investigates privacy leakage in dimensionality reduction methods through a novel machine learning-based reconstruction attack. Employing an informed adversary threat model, we develop a neural network capable of reconstructing high-dimensional data from low-dimensional embeddings. We evaluate six popular dimensionality reduction techniques: PCA, sparse random projection (SRP), multidimensional scaling (MDS), Isomap, t-SNE, and UMAP. Using both MNIST and NIH Chest X-ray datasets, we perform a qualitative analysis to identify key factors affecting reconstruction quality. Furthermore, we assess the effectiveness of an additive noise mechanism in mitigating these reconstruction attacks. Our experimental results on both datasets reveal that the attack is effective against deterministic methods (PCA and Isomap), but ineffective against methods that employ random initialization (SRP, MDS, t-SNE, and UMAP). When large noise is added to the images before applying PCA or Isomap, the attack produces severely distorted reconstructions. In contrast, for the other four methods, the reconstructions still show some recognizable features, though they bear little resemblance to the original images.
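
To make the attack setting concrete, here is a minimal sketch of a reconstruction attack on PCA embeddings. It is not the paper's implementation. It assumes synthetic low-rank data in place of MNIST or chest X-rays, it swaps the neural network for a least-squares affine decoder, and it models the informed adversary as someone who knows the originals for every released embedding except a held-out target set. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for image data (assumption): 500 samples in 64
# dimensions that lie close to a 5-dimensional linear subspace.
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 64))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 64))

# Data holder: fit PCA via SVD and release only the 5-D embeddings.
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
components = vt[:5]                  # top-5 principal directions
Z = (X - mean) @ components.T        # released embeddings

# Informed adversary (assumption): knows the originals for the first 450
# samples and wants to reconstruct the remaining 50 from their embeddings.
X_known, Z_known = X[:450], Z[:450]
X_target, Z_target = X[450:], Z[450:]

# Attack model: affine decoder fitted by least squares, used here as a
# simple stand-in for the neural network described in the abstract.
A_known = np.hstack([Z_known, np.ones((len(Z_known), 1))])
W, *_ = np.linalg.lstsq(A_known, X_known, rcond=None)
A_target = np.hstack([Z_target, np.ones((len(Z_target), 1))])
X_hat = A_target @ W

# Compare against a trivial baseline that always guesses the mean sample.
mse = float(np.mean((X_hat - X_target) ** 2))
baseline = float(np.mean((X_target - X_known.mean(axis=0)) ** 2))
print(f"attack MSE: {mse:.6f}, mean-guess baseline MSE: {baseline:.6f}")
```

Because PCA applies the same deterministic projection to every sample, a decoder fitted on known pairs transfers directly to the target embeddings. On this toy data the attack error is far below the mean-guess baseline.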

Problem

Research questions and friction points this paper is trying to address.

Investigating privacy risks in dimensionality reduction via reconstruction attacks

Evaluating the vulnerability of six dimensionality reduction (DR) methods to data reconstruction

Assessing additive noise effectiveness in mitigating reconstruction attacks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural network reconstructs high-dimensional data from embeddings

Evaluates six dimensionality reduction techniques for privacy leaks

Tests additive noise to mitigate reconstruction attack effectiveness
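
The additive noise idea can be sketched in the same toy setting. The example below is an illustration under stated assumptions, not the paper's mechanism: it adds Gaussian noise to synthetic low-rank data before PCA, then measures how well a least-squares affine decoder (standing in for the neural network) recovers the clean originals. The function name `attack_mse` and the noise levels are hypothetical.

```python
import numpy as np

def attack_mse(noise_std, seed=0):
    """Reconstruction error of a linear attack on PCA embeddings of noised data."""
    rng = np.random.default_rng(seed)

    # Synthetic clean data (assumption): 500 samples, 64 dimensions, rank 5.
    latent = rng.normal(size=(500, 5))
    mixing = rng.normal(size=(5, 64))
    X = latent @ mixing

    # Defense: perturb the data with Gaussian noise before running PCA.
    X_noisy = X + noise_std * rng.normal(size=X.shape)
    mean = X_noisy.mean(axis=0)
    _, _, vt = np.linalg.svd(X_noisy - mean, full_matrices=False)
    Z = (X_noisy - mean) @ vt[:5].T          # released 5-D embeddings

    # Attack: fit an affine decoder on 450 known (embedding, clean original)
    # pairs, then reconstruct the 50 held-out clean samples.
    A = np.hstack([Z, np.ones((len(Z), 1))])
    W, *_ = np.linalg.lstsq(A[:450], X[:450], rcond=None)
    X_hat = A[450:] @ W
    return float(np.mean((X_hat - X[450:]) ** 2))

for std in (0.0, 0.5, 5.0):
    print(f"noise std {std}: attack MSE {attack_mse(std):.6f}")
```

In this sketch the attack error grows as the noise level increases, which mirrors the direction of the reported PCA result. The cost is that the released embeddings also describe the clean data less faithfully, so the noise level trades privacy against utility.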

🔎 Similar Papers

Data Reconstruction Attacks and Defenses: A Systematic Evaluation