Robust Classification of High-Dimensional Data using Data-Adaptive Energy Distance

📅 2023-06-24
🏛️ ECML/PKDD
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the sensitivity of high-dimensional, low-sample-size (HDLSS) classification to noise, outliers, and distributional shift, this paper proposes a parameter-free, distribution-agnostic robust classifier. The method couples energy distance with local geometric structure for adaptive distance scaling, and integrates data-driven kernel bandwidth selection, weighted energy-distance computation, sparsity-aware subspace projection, and a distance-based nearest-neighbor decision framework. Crucially, it imposes no distributional assumptions (neither Gaussianity nor moment conditions) and substantially improves discriminative power in small-sample and non-spherical cluster settings. Across 12 HDLSS benchmark datasets it improves average accuracy by 3.2% over state-of-the-art methods, maintains ≥89% accuracy under 50% label noise, and runs two orders of magnitude faster at inference than deep-learning baselines.
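The paper's data-adaptive weighting and bandwidth selection are not detailed in this summary, but the plain energy-distance decision rule it builds on can be sketched as follows. This is a minimal, unweighted illustration, not the authors' implementation; `energy_distance_to_class` and `energy_classify` are hypothetical names. A test point is assigned to the class whose sample it has the smallest energy distance to:

```python
import numpy as np

def energy_distance_to_class(x, X_c):
    """Energy distance between a single point x and a class sample X_c.

    For a one-point sample, E(x, X_c) = 2 * mean_i ||x - x_i||
    - mean_{i,j} ||x_i - x_j|| (the within-sample term for x is zero).
    """
    cross = np.linalg.norm(X_c - x, axis=1).mean()
    within = np.linalg.norm(X_c[:, None, :] - X_c[None, :, :], axis=2).mean()
    return 2.0 * cross - within

def energy_classify(x, classes):
    """Assign x to the class with the smallest energy distance.

    `classes` maps a label to an (n_c, d) array of training points.
    """
    labels = list(classes)
    dists = [energy_distance_to_class(x, classes[c]) for c in labels]
    return labels[int(np.argmin(dists))]
```

The paper's contribution, per the summary, is making this rule robust in the HDLSS regime by rescaling distances with local geometry and learned weights rather than using the raw Euclidean norm shown here.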
Problem

Research questions and friction points this paper is trying to address.

Classification of high-dimensional, low-sample-size (HDLSS) data
Development of robust, parameter-free classifiers
Performance comparison with existing methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-Adaptive Energy Distance
Parameter-free Classifiers
HDLSS Data Robustness