🤖 AI Summary
To address a critical limitation of kernel logistic regression (KLR), namely its inherent lack of sparsity and the resulting trade-off between prediction accuracy and model interpretability, this paper proposes a sparse binary KLR modeling framework. Methodologically, it extends the dual training formulation of Keerthi et al. to a sparsity-inducing variant and devises an SMO-type decomposition algorithm that exploits second-order information and is proved to be globally convergent. Experiments on 12 benchmark datasets show that the framework achieves a competitive accuracy–sparsity trade-off relative to state-of-the-art baselines, including the Import Vector Machine (IVM), ℓ₁/₂-regularized KLR, and SVM, while fully preserving KLR's probabilistic output capability.
📝 Abstract
Kernel logistic regression (KLR) is a widely used supervised learning method for binary and multi-class classification, which provides estimates of the conditional probabilities of class membership for the data points. Unlike other kernel methods such as Support Vector Machines (SVMs), KLR models are generally not sparse. Previous attempts to deal with sparsity in KLR include a heuristic method referred to as the Import Vector Machine (IVM) and ad hoc regularizations such as the $\ell_{1/2}$-based one. Achieving a good trade-off between prediction accuracy and sparsity remains a challenging issue, with potentially significant impact from an application standpoint. In this work, we revisit binary KLR and propose an extension of the training formulation of Keerthi et al. that induces sparsity in the trained model while maintaining good testing accuracy. To efficiently solve the dual of this formulation, we devise a decomposition algorithm of Sequential Minimal Optimization type which exploits second-order information, and for which we establish global convergence. Numerical experiments conducted on 12 datasets from the literature show that the proposed binary KLR approach achieves a competitive trade-off between accuracy and sparsity with respect to IVM, $\ell_{1/2}$-based regularization for KLR, and SVM, while retaining the advantage of providing informative estimates of the class membership probabilities.
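To make concrete the probabilistic output that distinguishes KLR from SVM, the sketch below fits a plain (non-sparse) binary KLR model by gradient descent on the regularized negative log-likelihood and returns class-membership probabilities. This is a minimal illustration only; the kernel choice, step size, and regularization weight are assumptions, and the paper's actual solver is an SMO-type decomposition method on the dual, not this gradient loop.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * d2)

def fit_klr(X, y, lam=1e-2, gamma=1.0, n_iter=500, lr=0.1):
    """Illustrative KLR training: gradient descent on the penalized
    negative log-likelihood, with f(x) = sum_i alpha_i k(x_i, x) and
    labels y in {0, 1}. Not the paper's sparse SMO-type algorithm."""
    K = rbf_kernel(X, X, gamma)
    alpha = np.zeros(len(y))
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(K @ alpha)))     # P(y=1 | x_i)
        grad = K @ (p - y) + lam * (K @ alpha)     # gradient w.r.t. alpha
        alpha -= lr * grad / len(y)
    return alpha

def predict_proba(X_train, X_new, alpha, gamma=1.0):
    """Estimated class-membership probabilities for new points."""
    return 1.0 / (1.0 + np.exp(-(rbf_kernel(X_new, X_train, gamma) @ alpha)))
```

Note that every training point typically receives a nonzero coefficient `alpha_i` here, which is exactly the lack of sparsity the proposed formulation is designed to overcome.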