🤖 AI Summary
In high-dimensional settings, classical Linear Discriminant Analysis (LDA) suffers severe performance degradation because the sample covariance matrix cannot be estimated stably. To address this, we propose a new paradigm that optimizes the inverse covariance matrix directly. The matrix is treated as a learnable parameter, constrained via a Cholesky factorization to ensure positive definiteness, augmented with a low-rank extension to reduce computational complexity, and initialized via a multi-start strategy, including both the identity matrix and a warm start from classical LDA, to improve convergence robustness. Crucially, our approach bypasses explicit covariance estimation, striking a favorable trade-off between statistical stability and the feasibility of gradient-based optimization. Extensive experiments on multivariate simulations and real-world high-dimensional datasets show that our method significantly outperforms classical LDA and leading regularized variants (e.g., rLDA, sLDA) under small-sample conditions, delivering consistent gains in classification accuracy and generalization robustness.
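To make the parametrization concrete, here is a minimal sketch of how such a learnable inverse covariance could look, assuming Θ = LLᵀ with the strictly lower-triangular part of L left unconstrained and its diagonal kept positive through a softplus. The class `CholeskyPrecision`, the helper `lda_scores`, and all defaults are illustrative stand-ins, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

class CholeskyPrecision(torch.nn.Module):
    """Illustrative sketch (not the paper's code): a learnable inverse
    covariance Theta = L @ L.T with L lower triangular and a positive
    diagonal, so Theta is positive definite by construction."""

    def __init__(self, d: int):
        super().__init__()
        # Unconstrained parameter; the constraint lives in theta().
        self.raw = torch.nn.Parameter(torch.zeros(d, d))

    def theta(self) -> torch.Tensor:
        # Strictly lower-triangular part is free; softplus keeps the
        # diagonal positive, making L a valid Cholesky factor.
        L = torch.tril(self.raw, diagonal=-1)
        L = L + torch.diag_embed(F.softplus(torch.diagonal(self.raw)))
        return L @ L.T

def lda_scores(X, means, log_priors, theta):
    """LDA discriminant scores, usable as logits for a classification loss:
    delta_k(x) = x^T Theta mu_k - 0.5 * mu_k^T Theta mu_k + log pi_k.
    Shapes: X (n, d), means (K, d), log_priors (K,), theta (d, d)."""
    proj = X @ theta @ means.T                                   # (n, K)
    quad = 0.5 * torch.einsum("kd,de,ke->k", means, theta, means)
    return proj - quad + log_priors
```

Because positive definiteness is built into the parametrization, plain gradient steps on `raw` never leave the feasible set; under the same assumptions, the low-rank extension would replace the square factor with a tall d × r matrix (plus a small ridge term to keep Θ invertible).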
📝 Abstract
Linear discriminant analysis (LDA) is a fundamental method in statistical pattern recognition and classification, achieving Bayes optimality under Gaussian assumptions. It is well known, however, that classical LDA may struggle in high-dimensional settings because covariance estimation becomes unstable. In this work, we propose LDA with gradient optimization (LDA-GO), a new approach that directly optimizes the inverse covariance matrix via gradient descent. The algorithm parametrizes the inverse covariance matrix through a Cholesky factorization, incorporates a low-rank extension to reduce computational complexity, and employs a multiple-initialization strategy, including identity initialization and warm-starting from the classical LDA estimate. The effectiveness of LDA-GO is demonstrated through extensive multivariate simulations and real-data experiments.
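As a companion to the sketch above (it reuses `CholeskyPrecision` and `lda_scores`), the snippet below illustrates the multiple-initialization idea: run the same gradient-descent fit from an identity start and from a warm start derived from the classical LDA covariance estimate, then keep the better run. The helpers `raw_from_factor`, `warm_start_raw`, and `fit`, along with the cross-entropy loss and Adam schedule, are assumptions for illustration, not LDA-GO's actual training recipe.

```python
import numpy as np
import torch
import torch.nn.functional as F

def raw_from_factor(L: torch.Tensor) -> torch.Tensor:
    """Invert the softplus-on-the-diagonal parametrization so that
    CholeskyPrecision.theta() reproduces a given Cholesky factor L."""
    off = torch.tril(L, diagonal=-1)
    diag = torch.log(torch.expm1(torch.diagonal(L)))  # inverse softplus
    return off + torch.diag_embed(diag)

def warm_start_raw(X: np.ndarray, y: np.ndarray, reg: float = 1e-3) -> torch.Tensor:
    """Classical-LDA warm start: Cholesky factor of the inverse of a
    lightly ridged pooled within-class covariance (illustrative recipe;
    the paper's exact estimator may differ)."""
    Xc = X.astype(float).copy()
    classes = np.unique(y)
    for k in classes:
        Xc[y == k] -= Xc[y == k].mean(axis=0)     # center within each class
    S = Xc.T @ Xc / max(len(X) - len(classes), 1) + reg * np.eye(X.shape[1])
    L0 = np.linalg.cholesky(np.linalg.inv(S))     # inv(S) = L0 @ L0.T
    return raw_from_factor(torch.tensor(L0, dtype=torch.float32))

def fit(raw0, X, y, means, log_priors, steps=300, lr=1e-2):
    """One gradient-descent run from a given initialization (hypothetical
    training loop); returns the fitted model and final training loss."""
    model = CholeskyPrecision(X.shape[1])
    with torch.no_grad():
        model.raw.copy_(raw0)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = lda_scores(X, means, log_priors, model.theta())
        loss = F.cross_entropy(logits, y)
        loss.backward()
        opt.step()
    return model, float(loss)

# Multi-start (identity vs. LDA warm start), keeping the lower-loss run:
# inits = [raw_from_factor(torch.eye(d)), warm_start_raw(X.numpy(), y.numpy())]
# model, _ = min((fit(r, X, y, means, log_priors) for r in inits),
#                key=lambda pair: pair[1])
```

In this sketch the runs are compared by final training loss; a validation-based selection rule would be a natural alternative and may be what the paper uses.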