Gram-Schmidt Methods for Unsupervised Feature Extraction and Selection

📅 2023-11-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address unsupervised feature extraction and selection under nonlinear dependencies, this paper proposes the first Gram–Schmidt orthogonalization framework that operates directly in function space. It constructs an interpretable sequence of covariance matrices that either identifies new high-variance directions or removes redundant dependencies from known ones. On the theory side, the paper extends Gram–Schmidt orthogonalization to reproducing kernel Hilbert spaces (RKHS), derives an information-theoretic bound on entropy reduction, and strictly generalizes a Fourier-based feature selection method at significantly lower computational complexity. Empirically, the method outperforms state-of-the-art linear methods such as PCA, and is comparable to, and often better than, nonlinear methods such as kernel PCA (KPCA), UMAP, and autoencoders on synthetic and real-world benchmarks. In real-world feature selection tasks, it improves both accuracy and computational efficiency.

📝 Abstract

Feature extraction and selection in the presence of nonlinear dependencies among the data is a fundamental challenge in unsupervised learning. We propose using a Gram-Schmidt (GS) type orthogonalization process over function spaces to detect and map out such dependencies. Specifically, by applying the GS process over some family of functions, we construct a series of covariance matrices that can be used either to identify new large-variance directions or to remove those dependencies from known directions. In the former case, we provide information-theoretic guarantees in terms of entropy reduction. In the latter, we provide precise conditions under which the chosen function family eliminates existing redundancy in the data. Each approach provides both a feature extraction and a feature selection algorithm. Our feature extraction methods are linear and can be seen as a natural generalization of principal component analysis (PCA). We provide experimental results on synthetic and real-world benchmark datasets that show superior performance over state-of-the-art (linear) feature extraction and selection algorithms. Surprisingly, our linear feature extraction algorithms are comparable to, and often outperform, several important nonlinear feature extraction methods such as autoencoders, kernel PCA, and UMAP. Furthermore, one of our feature selection algorithms strictly generalizes a recent Fourier-based feature selection mechanism (Heidari et al., IEEE Transactions on Information Theory, 2022), yet at significantly reduced complexity.
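The abstract does not spell out the algorithms, so the following is only a rough sketch of the general idea of Gram-Schmidt-style feature selection over a function family, not the paper's method. It assumes a polynomial family (powers 1 to `degree` of each selected feature) and a greedy largest-residual-variance rule; the function name `gs_select`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def gs_select(X, k, degree=2, tol=1e-10):
    """Greedy feature selection sketch (illustrative, not the paper's algorithm).

    At each step, pick the column with the largest residual variance, then
    Gram-Schmidt-orthogonalize powers of that feature against the basis built
    so far and remove their contribution from every column's residual.
    """
    n, d = X.shape
    R = X - X.mean(axis=0)          # centered residuals, one column per feature
    basis = np.empty((n, 0))        # orthonormal basis of functions seen so far
    selected = []
    for _ in range(k):
        var = (R ** 2).mean(axis=0)
        if selected:
            var[selected] = -np.inf  # never pick the same feature twice
        j = int(np.argmax(var))
        selected.append(j)
        for p in range(1, degree + 1):
            f = X[:, j] ** p         # assumed function family: monomials
            f = f - f.mean()
            f = f - basis @ (basis.T @ f)   # Gram-Schmidt step
            norm = np.linalg.norm(f)
            if norm > tol:
                q = f / norm
                basis = np.column_stack([basis, q])
                R = R - np.outer(q, q @ R)  # remove what q explains
    return selected

# Toy data: column 1 is a quadratic function of column 0, column 2 is independent.
rng = np.random.default_rng(0)
x0 = 2.0 * rng.normal(size=2000)
X = np.column_stack([x0, 0.3 * x0 ** 2, rng.normal(size=2000)])

picked_nonlinear = gs_select(X, 2, degree=2)
picked_linear = gs_select(X, 2, degree=1)
```

In this toy setup, the degree-2 family lets the procedure recognize the second column as redundant once the first is chosen, whereas the purely linear variant (degree 1) cannot see that dependency and is expected to pick the redundant column.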

Problem

Research questions and friction points this paper is trying to address.

Detect nonlinear dependencies in unsupervised data

Extract and select features using Gram-Schmidt orthogonalization

Improve performance over linear and nonlinear methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gram-Schmidt orthogonalization over function spaces

Linear feature extraction generalizing PCA

Reduced complexity Fourier-based feature selection
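To make the "linear feature extraction generalizing PCA" item concrete, here is a minimal sketch under stated assumptions; it is not taken from the paper. It assumes each new direction is the top eigenvector of the covariance of the current residual, and that after each pick the residual is orthogonalized against polynomial functions of the extracted feature. The name `gs_extract` and its parameters are hypothetical. With `degree=1` the procedure reduces to ordinary PCA by deflation.

```python
import numpy as np

def gs_extract(X, k, degree=2, tol=1e-10):
    """Linear feature extraction sketch (illustrative assumption, not the paper's code).

    Returns a (k, d) array whose rows are projection directions.
    """
    Xc = X - X.mean(axis=0)
    n = Xc.shape[0]
    R = Xc.copy()                   # residual data
    basis = np.empty((n, 0))        # orthonormal basis of removed functions
    W = []
    for _ in range(k):
        cov = R.T @ R / n
        _, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
        w = eigvecs[:, -1]                  # top-variance direction of the residual
        W.append(w)
        z = Xc @ w                          # the extracted (linear) feature
        for p in range(1, degree + 1):
            f = z ** p                      # assumed function family: monomials
            f = f - f.mean()
            f = f - basis @ (basis.T @ f)   # Gram-Schmidt step
            norm = np.linalg.norm(f)
            if norm > tol:
                q = f / norm
                basis = np.column_stack([basis, q])
                R = R - np.outer(q, q @ R)
    return np.array(W)

# Toy data with well-separated variances along the coordinate axes.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

W_pca_like = gs_extract(X, 2, degree=1)   # should coincide with PCA directions
W_nonlinear = gs_extract(X, 2, degree=2)  # also removes quadratic dependencies
```

The extracted features remain linear projections of the data in both cases; the function family only changes which dependencies are removed from the residual before the next direction is chosen.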
