🤖 AI Summary
In high-energy physics and related domains, signal-background classifiers often model the physical correlations among input variables poorly. To address this, we propose the Random Distribution Shuffle Attack (RDSA), a novel adversarial perturbation method that targets the *correlation structure* of the input data rather than individual features. RDSA is the first approach to shift the adversarial objective from feature space to *correlation space*. It integrates prior scientific constraints to guide both data augmentation and adversarial training, establishing a correlation-aware learning paradigm. Evaluated on six cross-domain benchmarks, including CERN Open Data, MNIST, and HAR, RDSA consistently improves model generalization and robustness: under low-data and distribution-shift scenarios, it improves classification accuracy by 3.2–7.8% on average. These results empirically support the effectiveness and broad applicability of explicitly incorporating variable correlation constraints into machine learning models.
📝 Abstract
Correlations between input parameters play a crucial role in many scientific classification tasks, since they are often related to fundamental laws of nature. For example, in high-energy physics, a common deep learning use case is the classification of signal and background processes in particle collisions. In many such cases, the fundamental principles behind the correlations between observables are better understood than the actual distributions of the observables themselves. In this work, we present a new adversarial attack algorithm called Random Distribution Shuffle Attack (RDSA), which emphasizes the correlations between observables in the network rather than individual feature characteristics. When the generated adversaries are used within adversarial training, correct application of the proposed attack can significantly improve classification performance, particularly in the context of data augmentation. Given that correlations between input features are also crucial in many other disciplines, we demonstrate the effectiveness of RDSA on six classification tasks: two particle collision challenges (using CERN Open Data), handwritten digit recognition (MNIST784), human activity recognition (HAR), weather forecasting (Rain in Australia), and ICU patient mortality (MIMIC-IV). These results demonstrate a general use case beyond fundamental physics for this new type of adversarial attack algorithm.
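The distinction the abstract draws between per-observable distributions and the correlations between observables can be illustrated with a small toy example. The sketch below is not the authors' RDSA algorithm, whose details are not given here. It is a minimal illustration, under our own assumptions, of a shuffle-style perturbation: independently permuting a feature column leaves that feature's one-dimensional distribution unchanged while breaking its correlation with the other features. The function name `shuffle_features`, its parameters, and the toy data are hypothetical.

```python
import numpy as np


def shuffle_features(X, n_features, rng):
    """Independently permute `n_features` randomly chosen columns of X.

    Hypothetical helper for illustration only (not the published RDSA).
    Each permuted column keeps exactly the same set of values, so its
    marginal distribution is unchanged, but its pairing with the other
    columns (and hence its correlation with them) is randomized.
    """
    X_adv = X.copy()
    cols = rng.choice(X.shape[1], size=n_features, replace=False)
    for c in cols:
        X_adv[:, c] = rng.permutation(X_adv[:, c])
    return X_adv, cols


rng = np.random.default_rng(0)

# Toy data: two strongly correlated "observables".
x0 = rng.normal(size=5000)
x1 = 0.9 * x0 + 0.1 * rng.normal(size=5000)
X = np.column_stack([x0, x1])

X_adv, cols = shuffle_features(X, n_features=1, rng=rng)

# Pearson correlation between the two observables before and after.
corr_before = np.corrcoef(X, rowvar=False)[0, 1]
corr_after = np.corrcoef(X_adv, rowvar=False)[0, 1]
print(f"correlation before: {corr_before:.3f}, after: {corr_after:.3f}")
```

In this toy setting, a classifier that looks only at individual feature distributions could not tell `X` from `X_adv`, whereas one that uses the correlation between the two observables could. That is the kind of sensitivity the abstract describes as the target of the attack.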