🤖 AI Summary
This paper addresses the theoretical disconnect between classical PCA and modern representation learning by introducing a novel perspective grounded in Difference-of-Convex (DC) optimization. Methodologically, it: (i) interprets simultaneous iteration, which underlies the classical QR algorithm, as a concrete instance of the difference-of-convex algorithm (DCA); (ii) builds on the recently shown theoretical link between (kernel) PCA and self-attention mechanisms; (iii) formulates kernel PCA within the DC framework, enabling out-of-sample extension; and (iv) derives a kernelizable dual formulation of ℓ₁-norm-based robust PCA. The contributions are threefold: (i) unifying the optimization principles underlying the QR algorithm, kernel methods, and self-attention; (ii) yielding a family of new PCA variants that are kernelizable, scalable, and robust; and (iii) providing rigorous theoretical analysis and empirical validation, with numerical experiments demonstrating competitive or superior performance versus state-of-the-art methods. Collectively, this work furnishes a principled, interpretable, and generalizable optimization framework for classical dimensionality reduction.
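To make the simultaneous-iteration scheme in point (i) concrete, here is a minimal NumPy sketch of the classical algorithm: repeatedly multiply an orthonormal block by the covariance matrix and re-orthonormalize via QR. This shows only the textbook iteration; the paper's contribution is reading each such step as a DCA step, and the specific DC decomposition is not reproduced here. All names (`simultaneous_iteration`, `iters`, `seed`) and the usage data are illustrative assumptions, not from the paper.

```python
import numpy as np

def simultaneous_iteration(C, k, iters=200, seed=0):
    """Classical simultaneous (orthogonal) iteration for the top-k
    eigenvectors of a symmetric PSD matrix C: multiply an orthonormal
    block by C, then re-orthonormalize with a QR factorization."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((C.shape[0], k)))
    for _ in range(iters):
        Q, _ = np.linalg.qr(C @ Q)  # power step + re-orthonormalization
    return Q

# usage sketch: leading principal directions of centered data (synthetic)
X = np.random.default_rng(1).standard_normal((500, 10))
Xc = X - X.mean(axis=0)
Q = simultaneous_iteration(Xc.T @ Xc / len(Xc), k=3)
```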
📝 Abstract
Motivated by the recently shown connection between self-attention and (kernel) principal component analysis (PCA), we revisit the fundamentals of PCA. Using the difference-of-convex (DC) framework, we present several novel formulations and provide new theoretical insights. In particular, we show the kernelizability and out-of-sample applicability of a PCA-like family of problems. Moreover, we uncover that simultaneous iteration, which is connected to the classical QR algorithm, is an instance of the difference-of-convex algorithm (DCA), offering an optimization perspective on this longstanding method. Further, we describe new algorithms for PCA and empirically compare them with state-of-the-art methods. Lastly, we introduce a kernelizable dual formulation for a robust variant of PCA that minimizes the $\ell_1$ deviation of the reconstruction errors.
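For readers unfamiliar with the out-of-sample extension the abstract refers to, the sketch below shows the standard (textbook) construction for kernel PCA: fit dual coefficients from the centered training kernel, then project new points through a consistently centered cross-kernel. This is not the paper's DC-based derivation; the RBF kernel, `gamma`, and all function names are assumptions made for illustration.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def kpca_fit(X, k, gamma=0.5):
    """Fit kernel PCA: return dual coefficients and the training kernel."""
    n = len(X)
    K = rbf(X, X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    vals, vecs = np.linalg.eigh(H @ K @ H)     # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]           # keep the k largest
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return alphas, K

def kpca_transform(Xnew, X, alphas, K, gamma=0.5):
    """Out-of-sample projection of new points onto the fitted components."""
    Knew = rbf(Xnew, X, gamma)
    Kc = (Knew
          - Knew.mean(axis=1, keepdims=True)   # mean over training points
          - K.mean(axis=0)                     # training column means
          + K.mean())                          # grand mean
    return Kc @ alphas
```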