🤖 AI Summary
Existing physics-driven validation methods struggle to capture all latent discrepancies between simulation and real-world data, leaving hidden vulnerabilities in deep learning models deployed in critical domains such as high-energy physics. This work proposes CONSERVAttack, a novel adversarial attack framework that integrates physical uncertainty constraints to generate perturbations that lie within known physical bounds yet evade conventional validation protocols. By formulating the attack as a constrained optimization problem, CONSERVAttack exposes model weaknesses under distributional shifts while preserving consistency of control-region distributions. Notably, this approach represents the first integration of adversarial attacks with experimental uncertainty modeling, revealing fundamental limitations of current validation mechanisms and motivating targeted strategies for improving model robustness.
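As an illustration of the constrained-optimization formulation, the following minimal sketch runs a projected-gradient attack whose perturbation is clamped to per-feature systematic-uncertainty bounds. The classifier, loss, step size, and the name `constrained_attack` are illustrative assumptions, not the authors' implementation:

```python
import torch

def constrained_attack(model, x, y, sigma, steps=50, lr=0.01):
    """Hypothetical sketch: maximize the model's loss while keeping each
    perturbation inside the per-feature uncertainty band [-sigma, +sigma].
    `sigma` may be a scalar or a tensor broadcastable to `x`."""
    delta = torch.zeros_like(x, requires_grad=True)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += lr * delta.grad.sign()       # ascend the loss surface
            delta.clamp_(min=-sigma, max=sigma)   # project onto uncertainty bounds
            delta.grad.zero_()
    return (x + delta).detach()
```

The projection step is what distinguishes this from an unconstrained attack: every perturbed event remains compatible with the assumed systematic uncertainties by construction.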
📝 Abstract
In High Energy Physics, as in many other fields of science, machine learning techniques have been crucial in advancing our understanding of fundamental phenomena. Increasingly, deep learning models are applied to analyze both simulated and experimental data. In most experiments, a rigorous regime of testing for physically motivated systematic uncertainties is in place. The numerical evaluation of these tests for differences between data and simulation quantifies the effect of potential sources of mismodelling on the machine learning output. In addition, thorough comparisons of marginal distributions and (linear) feature correlations between data and simulation in "control regions" are applied. However, guidance by physical motivation, and the need to constrain comparisons to specific regions, does not guarantee that all possible sources of deviation have been accounted for. We therefore propose a new adversarial attack, the CONSERVAttack, designed to exploit the space of hypothetical deviations between simulation and data that remains after the above tests. The resulting adversarial perturbations stay consistent within the uncertainty bounds, evading standard validation checks, while successfully fooling the underlying model. We further propose strategies to mitigate such vulnerabilities and argue that robustness to adversarial effects must be considered when interpreting results from deep learning in particle physics.
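For concreteness, here is a hedged sketch of the kind of control-region validation described above, which perturbations bounded as in the attack could nonetheless pass: per-feature two-sample Kolmogorov-Smirnov tests on the marginal distributions and a comparison of linear (Pearson) feature correlations. The function name and thresholds are hypothetical:

```python
import numpy as np
from scipy import stats

def passes_validation(data, sim, p_threshold=0.05, corr_tolerance=0.05):
    """Hypothetical check: is `sim` statistically compatible with `data`
    (both of shape [n_events, n_features]) in a control region?"""
    # Marginal distributions: two-sample KS test per feature.
    for j in range(data.shape[1]):
        if stats.ks_2samp(data[:, j], sim[:, j]).pvalue < p_threshold:
            return False
    # Linear feature correlations: compare Pearson correlation matrices.
    corr_gap = np.abs(np.corrcoef(data, rowvar=False)
                      - np.corrcoef(sim, rowvar=False))
    return float(corr_gap.max()) < corr_tolerance
```

Because these checks only probe marginals and linear correlations, a perturbation that preserves both while distorting higher-order structure can slip through, which is exactly the residual space the attack is designed to exploit.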