🤖 AI Summary
Comparing representation learning paradigms for foundation models in particle physics (contrastive learning, masked particle modeling, and generative reconstruction) remains difficult because existing studies differ in architecture, preprocessing, and evaluation protocol.
Method: We conduct a systematic, controlled comparison within a unified framework that uses a shared Transformer-based particle-cloud encoder, standardized preprocessing, matched sampling, and a consistent evaluation protocol on a jet classification dataset. We further introduce targeted supervised architectural modifications.
Contribution/Results: The supervised architectural modifications achieve state-of-the-art (SOTA) performance on benchmark evaluations. The controlled comparison isolates the contribution of each learning objective and highlights the strengths and limitations of each paradigm. The resulting reproducible baselines are intended as a reference point for foundation model development in particle physics, supporting more transparent and robust progress across the community.
📝 Abstract
We present a systematic evaluation of representation learning objectives for particle physics within a unified framework. Our study employs a shared transformer-based particle-cloud encoder with standardized preprocessing, matched sampling, and a consistent evaluation protocol on a jet classification dataset. We compare contrastive (supervised and self-supervised), masked particle modeling, and generative reconstruction objectives under a common training regimen. In addition, we introduce targeted supervised architectural modifications that achieve state-of-the-art performance on benchmark evaluations. This controlled comparison isolates the contribution of each learning objective, highlights the strengths and limitations of each, and provides reproducible baselines. We position this work as a reference point for the future development of foundation models in particle physics, enabling more transparent and robust progress across the community.
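For readers unfamiliar with the contrastive objective named above, the following is a minimal sketch of an InfoNCE-style loss, one common form of self-supervised contrastive learning. It is not taken from the paper: the function names `l2_normalize` and `info_nce`, the temperature value, and the use of in-batch negatives are all illustrative assumptions, and the embeddings stand in for whatever the shared encoder would produce for two views of the same jet.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length (assumes a non-zero input)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper, not the paper's code).

    positives[i] is treated as the positive for anchors[i]; every other
    row of `positives` acts as an in-batch negative for that anchor.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    losses = []
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index as the target class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: two jets, each with two "views".
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy case the correctly paired views give a much lower loss than the swapped pairing, which is the behavior the objective rewards. A supervised contrastive variant would instead treat all samples sharing a class label as positives.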