🤖 AI Summary
Assessing the statistical fidelity of generative models for LHC simulations beyond their training regime is challenging, particularly when estimating event amplification factors in sparse phase-space regions without sacrificing resolution or relying on large held-out datasets.
Method: We propose a data-efficient quantification framework comprising two complementary strategies: (i) integral amplification estimation based on integrated precision metrics, and (ii) differential amplification assessment grounded in rigorous hypothesis testing. Our approach integrates Bayesian modeling, ensemble inference, and statistically principled calibration, validated across diverse phase-space regions using state-of-the-art physics-informed generators.
Contribution/Results: Experiments demonstrate high-fidelity amplification (>100×) in critical subregions, though global coverage remains limited. This work establishes the first interpretable, verifiable, and held-out-data-free framework for quantifying generative amplification capability, enabling accelerated, statistically robust high-energy physics simulation.
📝 Abstract
Generative networks are powerful tools for enhancing the speed and precision of LHC simulations. It is important to understand their statistical precision, especially when generating more events than the training dataset contains. We present two complementary methods to estimate the amplification factor without large holdout datasets. Averaging amplification uses Bayesian networks or ensembling to estimate the amplification from the precision of integrals over given phase-space volumes. Differential amplification uses hypothesis testing to quantify the amplification without any loss of resolution. Applied to state-of-the-art event generators, both methods indicate that amplification is possible in specific regions of phase space, but not yet across the entire distribution.
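The two strategies can be illustrated on a toy one-dimensional Gaussian model. This is a minimal sketch, not the paper's implementation: the ensemble spread, the shifted-mean stand-in "generator", and the use of a Kolmogorov-Smirnov test are all illustrative assumptions made here for concreteness.

```python
import numpy as np
from scipy.stats import kstest, norm

rng = np.random.default_rng(0)
N_train = 1_000  # size of the (toy) training dataset

# --- Averaging amplification -------------------------------------------
# Estimate the integral of the density over a phase-space volume with an
# ensemble of generator estimates and compare their spread to the
# statistical precision of the training data on the same integral.
lo, hi = 1.0, 2.0                                     # volume of interest
p_true = norm.cdf(hi) - norm.cdf(lo)                  # true bin content
sigma_train = np.sqrt(p_true * (1 - p_true) / N_train)  # binomial error

# Hypothetical ensemble: each member estimates the bin content with a
# spread 10x below the training error (stand-in for trained networks).
ensemble = rng.normal(p_true, sigma_train / 10, size=100)
amp_avg = sigma_train**2 / ensemble.std(ddof=1)**2
print(f"averaging amplification in [{lo}, {hi}]: ~{amp_avg:.0f}x")

# --- Differential amplification ----------------------------------------
# Grow the generated sample until a hypothesis test against the true
# density rejects it; the last passing sample size bounds the usable
# amplification, with no binning and hence no resolution loss.
def generate(n):
    # toy "generator" with a small residual systematic (shifted mean)
    return rng.normal(0.05, 1.0, size=n)

n, n_pass = N_train, 0
while n <= 10**6:
    if kstest(generate(n), norm.cdf).pvalue < 0.05:
        break  # the mismodelling is now statistically visible
    n_pass = n
    n *= 2
amp_diff = n_pass // N_train
print(f"differential amplification: ~{amp_diff}x before rejection")
```

The first estimator turns a variance ratio into an effective sample-size gain for one integral; the second finds how far a generated sample can be grown before a two-sided test exposes the generator's residual systematics.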