🤖 AI Summary
This work addresses change-point detection in high-dimensional, low-sample-size (HDLSS) settings, where heavy-tailed or contaminated data can make moment-based methods unreliable. The authors propose a dimension-averaged angular kernel scan that aggregates bounded one-dimensional angular discrepancies across coordinates, giving a nonparametric, hyperparameter-free procedure that requires no finite marginal moments. For the offline single-change problem, they derive an exact factorization of the population mean into a universal deterministic shape function and a scalar signal factor, characterize the null covariance up to a scalar long-run variance factor, and prove an HDLSS multivariate central limit theorem under cross-coordinate mixing. These results yield plug-in Gaussian calibration, asymptotic Type I error control, and power and localization guarantees, including a $d^{-1/2}$ local detection scale. A fixed-window sequential extension for streaming data comes with average run length (ARL) calibration and worst-case bounds on the expected detection delay (EDD). In simulations, the method accurately detects and localizes changes in HDLSS and streaming settings where moment-based or hyperparameter-sensitive procedures may be unreliable.
📝 Abstract
We study change-point detection for high-dimensional data in regimes where inference must be performed from small batches of observations. Our primary focus is the high-dimensional, low sample size (HDLSS) regime, where the sequence length is fixed while the ambient dimension diverges. We propose a dimension-averaged angular kernel scan framework for detecting marginal distributional shifts. The statistic aggregates bounded one-dimensional angular discrepancies across coordinates, yielding a fully nonparametric, hyperparameter-free, and moment-agnostic estimator that remains well-defined without specifying, estimating, or assuming finite marginal moments, for example under heavy-tailed or contaminated distributions. For the offline single-change problem, we derive an exact population mean factorization into a universal deterministic shape function and a scalar signal factor, characterize the null covariance structure up to a scalar long-run variance factor, and establish an HDLSS multivariate central limit theorem under cross-coordinate mixing. These results lead to plug-in Gaussian calibration, asymptotic type-I error control, and power and localization guarantees, including a $d^{-1/2}$ local detection scale. We further extend the offline procedure to a fixed-window sequential monitoring procedure for high-dimensional streaming data, and obtain ARL calibration and worst-case EDD bounds. Simulation studies demonstrate that the proposed method can accurately detect and localize changes in challenging HDLSS and streaming settings where moment-based or hyperparameter-sensitive procedures may be unreliable.
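To make the idea of a dimension-averaged scan concrete, the following is a minimal sketch and not the authors' estimator. The abstract does not specify the angular kernel, so the sketch assumes a rank-based stand-in: in one dimension the angle between x - z and y - z is π when z lies between x and y and 0 otherwise, so an averaged angle reduces to the fraction of sample points between x and y, which the code approximates by a scaled rank difference. The function names `rank_distance` and `scan_statistics`, the `min_seg` parameter, and the energy-type contrast are all illustrative assumptions. No Gaussian calibration or thresholding is implemented.

```python
import numpy as np


def rank_distance(x):
    """Bounded 1-D discrepancy |rank_i - rank_k| / n (assumed stand-in kernel)."""
    n = x.shape[0]
    # Double argsort gives ranks 0..n-1; ties are broken arbitrarily.
    ranks = np.argsort(np.argsort(x))
    return np.abs(ranks[:, None] - ranks[None, :]) / n


def scan_statistics(X, min_seg=2):
    """Scan all candidate splits of an (n, d) array X.

    Returns a length-n array whose entry t holds the dimension-averaged
    contrast between rows [0, t) and rows [t, n); invalid splits are -inf.
    """
    n, d = X.shape
    # Average the per-coordinate distance matrices. The contrast below is
    # linear in D, so this equals averaging per-coordinate statistics.
    D = np.zeros((n, n))
    for j in range(d):
        D += rank_distance(X[:, j])
    D /= d

    stats = np.full(n, -np.inf)
    for t in range(min_seg, n - min_seg + 1):
        m = n - t
        cross = D[:t, t:].mean()
        # Diagonal of D is zero, so these are means over off-diagonal pairs.
        within_pre = D[:t, :t].sum() / (t * (t - 1))
        within_post = D[t:, t:].sum() / (m * (m - 1))
        stats[t] = 2.0 * cross - within_pre - within_post
    return stats
```

A typical use would call `scan_statistics(X)` on a short sequence with many coordinates and take `np.argmax` of the result as the estimated change location. Because the distance depends only on ranks, the sketch remains well-defined for heavy-tailed inputs such as Cauchy draws, which mirrors the moment-free property described in the abstract.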