An Evaluation of Representation Learning Methods in Particle Physics Foundation Models

📅 2025-11-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Representation learning paradigms for particle physics foundation models (contrastive learning, masked particle modeling, and generative reconstruction) are hard to compare because existing work uses inconsistent architectures, preprocessing, and evaluation protocols. Method: We run a controlled comparison within a unified framework: a shared transformer-based particle-cloud encoder, standardized preprocessing, matched sampling, a common training regimen, and a consistent evaluation protocol on a jet classification dataset. We also introduce targeted supervised architectural modifications. Contribution/Results: The supervised modifications achieve state-of-the-art performance on benchmark evaluations. The controlled setup isolates the contribution of each learning objective, highlights the respective strengths and limitations of the objectives, and provides reproducible baselines intended as a reference point for future foundation model development in particle physics.

📝 Abstract
We present a systematic evaluation of representation learning objectives for particle physics within a unified framework. Our study employs a shared transformer-based particle-cloud encoder with standardized preprocessing, matched sampling, and a consistent evaluation protocol on a jet classification dataset. We compare contrastive (supervised and self-supervised), masked particle modeling, and generative reconstruction objectives under a common training regimen. In addition, we introduce targeted supervised architectural modifications that achieve state-of-the-art performance on benchmark evaluations. This controlled comparison isolates the contributions of the learning objective, highlights their respective strengths and limitations, and provides reproducible baselines. We position this work as a reference point for the future development of foundation models in particle physics, enabling more transparent and robust progress across the community.
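To make the contrastive objective named in the abstract concrete, here is a minimal sketch of an InfoNCE-style loss, a common form of self-supervised contrastive learning. It is an illustration under stated assumptions, not the paper's implementation: the function name `info_nce`, the `temperature` default, and the use of plain Python lists as embeddings are all hypothetical. Each row of `anchors` and the matching row of `positives` are assumed to be encoder outputs for two augmented views of the same jet.

```python
import math


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper, not from the paper).

    anchors[i] and positives[i] are embeddings of two views of the same jet;
    every other positives[j] in the batch serves as a negative for anchors[i].
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i with every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view as the correct class.
        total += log_denom - logits[i]
    return total / len(a)
```

The loss is small when each anchor is most similar to its own positive and grows when positives are mismatched. A supervised contrastive variant would instead treat all same-class jets in the batch as positives.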
Problem

Research questions and friction points this paper is trying to address.

Evaluating representation learning objectives for particle physics foundation models
Comparing contrastive, masked modeling, and generative reconstruction methods systematically
Establishing reproducible baselines for future particle physics model development
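As a rough illustration of the masked modeling objective listed above, the sketch below hides a random subset of particles in a jet and scores a prediction only on the hidden positions. It rests on simplifying assumptions: a jet is a list of per-particle feature lists, masked particles are replaced by zeros, and the loss is a plain mean squared error on continuous features. The names `mask_particles` and `masked_mse` are hypothetical, and the paper's actual masking scheme and reconstruction target may differ (for example, predicting discrete tokens).

```python
import random


def mask_particles(jet, mask_fraction=0.3, rng=None):
    """Zero out a random subset of particles; return (corrupted_jet, masked_indices)."""
    rng = rng or random.Random(0)
    n_masked = max(1, round(mask_fraction * len(jet)))
    masked_idx = sorted(rng.sample(range(len(jet)), n_masked))
    masked_set = set(masked_idx)
    corrupted = [
        [0.0] * len(p) if i in masked_set else list(p)
        for i, p in enumerate(jet)
    ]
    return corrupted, masked_idx


def masked_mse(predicted, jet, masked_idx):
    """Mean squared error computed only over the masked particles."""
    errors = [
        (ph - th) ** 2
        for i in masked_idx
        for ph, th in zip(predicted[i], jet[i])
    ]
    return sum(errors) / len(errors)
```

In a training loop, the corrupted jet would be passed through the shared encoder and a prediction head, and `masked_mse` would compare that output against the original jet at the masked indices.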
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based particle-cloud encoder architecture
Controlled comparison of representation learning objectives
Targeted supervised modifications achieve state-of-the-art performance
Authors
Michael Chen
Undergraduate, Carnegie Mellon University
Raghav Kansal
Division of Physics, Mathematics and Astronomy, California Institute of Technology, Pasadena, CA 91125; Fermi National Accelerator Laboratory, Batavia, IL 60510, USA
Abhijith Gandrakota
Particle Physics Division, Fermi National Accelerator Laboratory, Batavia, IL 60510
Zichun Hao
Division of Physics, Mathematics and Astronomy, California Institute of Technology, Pasadena, CA 91125
Jennifer Ngadiuba
Wilson Fellow, Fermilab
experimental high-energy physics, data science, deep learning, artificial intelligence, FPGAs
Maria Spiropulu
Division of Physics, Mathematics and Astronomy, California Institute of Technology, Pasadena, CA 91125