🤖 AI Summary
This review addresses a common problem in machine learning: traditional methods struggle to extrapolate to extreme regions because data in the tails are scarce. It synthesizes recent work at the intersection of extreme value theory and statistical learning, focusing on principled methods built on asymptotically motivated representations of univariate and multivariate tail distributions. The coverage spans supervised and unsupervised settings, including regression and classification beyond the training data, extreme quantile regression, dimension reduction, generative AI, and anomaly detection, and it considers theoretical frameworks for both asymptotically dependent and asymptotically independent data. The review discusses how these frameworks translate into efficient statistical methods for extrapolation to extreme regions and identifies promising directions for future research.
📝 Abstract
Extreme value theory provides rigorous theory and statistical tools for extrapolation in machine learning, particularly in settings where traditional methods struggle due to data scarcity in the tails. A broad range of tasks benefits from these tools, including regression and classification beyond the training data, extreme quantile regression, supervised and unsupervised dimension reduction, generative artificial intelligence, and anomaly detection. This review synthesizes recent developments in these areas, at the intersection of statistical learning and extreme value theory, with a focus on principled methods based on asymptotically motivated representations of the tail of univariate and multivariate distributions. We consider different theoretical frameworks for both asymptotically dependent and asymptotically independent data and discuss how they translate into efficient statistical methods for extrapolation to extreme regions. By addressing both theoretical and practical aspects, we offer a comprehensive overview of the state of the art in this rapidly evolving field and identify promising directions for future research.
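
As a concrete illustration of what "extrapolation to extreme regions" can look like in the univariate case, the sketch below uses two standard textbook tools: the Hill estimator of the extreme value index and a Weissman-type extreme quantile estimate. The abstract does not name these estimators, so treating them as representative is an assumption made here. The function names, the synthetic Pareto sample, the heavy-tail assumption (positive extreme value index), and the choice of `k` are all hypothetical and chosen only for illustration.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the extreme value index from the k largest points.

    Assumes a heavy-tailed distribution (positive extreme value index).
    """
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the Hill estimator needs a positive threshold")
    # Average log-excess of the k largest points over the threshold.
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


def weissman_quantile(sample, k, p):
    """Weissman-type estimate of the quantile exceeded with probability p."""
    n = len(sample)
    gamma = hill_estimator(sample, k)
    threshold = sorted(sample, reverse=True)[k]
    # Scale the empirical threshold using the fitted power-law tail.
    return threshold * (k / (n * p)) ** gamma


# Synthetic data (assumption): exact Pareto tail with index 1 / alpha = 0.5,
# drawn by inverse-transform sampling.
rng = random.Random(0)
alpha = 2.0
sample = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(20000)]

gamma_hat = hill_estimator(sample, k=500)
# p is smaller than 1/n, so the empirical quantile is uninformative here.
q_hat = weissman_quantile(sample, k=500, p=1e-5)
q_true = (1e-5) ** (-1.0 / alpha)  # closed form for this Pareto model
print(gamma_hat, q_hat, q_true)
```

The number of upper order statistics `k` is a tuning parameter: a larger `k` lowers the variance but, outside the exact Pareto case simulated here, typically increases the bias of both estimates.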