An Evaluation of Representation Learning Methods in Particle Physics Foundation Models

📅 2025-11-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Comparing representation learning paradigms (contrastive learning, masked particle modeling, and generative reconstruction) for particle physics foundation models is difficult when architectures, preprocessing, and evaluation protocols are inconsistent. Method: The authors run a controlled comparison in a unified framework with a shared Transformer-based particle-cloud encoder, standardized preprocessing, matched sampling, a common training regimen, and a consistent evaluation protocol on a jet classification dataset. They also introduce targeted supervised architectural modifications. Contribution/Results: The supervised modifications reach state-of-the-art performance on benchmark evaluations. Because everything except the learning objective is held fixed, the comparison isolates each objective's contribution, highlights the objectives' respective strengths and limitations, and provides reproducible baselines intended as a reference point for future foundation models in particle physics.

📝 Abstract

We present a systematic evaluation of representation learning objectives for particle physics within a unified framework. Our study employs a shared transformer-based particle-cloud encoder with standardized preprocessing, matched sampling, and a consistent evaluation protocol on a jet classification dataset. We compare contrastive (supervised and self-supervised), masked particle modeling, and generative reconstruction objectives under a common training regimen. In addition, we introduce targeted supervised architectural modifications that achieve state-of-the-art performance on benchmark evaluations. This controlled comparison isolates the contributions of the learning objective, highlights their respective strengths and limitations, and provides reproducible baselines. We position this work as a reference point for the future development of foundation models in particle physics, enabling more transparent and robust progress across the community.
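As an illustration of the self-supervised contrastive objective mentioned above, here is a minimal sketch of an InfoNCE-style loss. The abstract does not say which contrastive loss the paper uses, so this formulation is an assumption, and the names `info_nce_loss`, `cosine_similarity`, and `temperature` are hypothetical. Each jet is assumed to have two embeddings (for example, from two augmented views), with `view_a[i]` and `view_b[i]` forming the positive pair and all other pairings acting as negatives.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss; view_a[i] and view_b[i] are the positive pair.

    Illustrative assumption only: the paper's actual loss is not specified
    in the abstract.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        # Similarity of anchor i to every candidate in the other view.
        logits = [cosine_similarity(view_a[i], view_b[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits))
        # Cross-entropy with the positive pair (index i) as the target.
        total += log_sum_exp - logits[i]
    return total / n


aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the loss is lower when the positive pairs match (`aligned`) than when they are swapped (`swapped`), which is the behavior a contrastive objective rewards.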

Problem

Research questions and friction points this paper is trying to address.

Evaluating representation learning objectives for particle physics foundation models

Comparing contrastive, masked modeling, and generative reconstruction methods systematically

Establishing reproducible baselines for future particle physics model development
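To make the masked modeling objective listed above more concrete, the following is a minimal sketch of the masking step only. It assumes a jet is a list of per-particle feature tuples; the feature layout, `MASK_TOKEN`, `mask_particles`, and the default `mask_fraction` are all hypothetical and are not taken from the paper. A model would then be trained to predict the hidden particles in `targets` from the visible ones.

```python
import random

# Hypothetical placeholder that stands in for a hidden particle.
MASK_TOKEN = (0.0, 0.0, 0.0)


def mask_particles(jet, mask_fraction=0.3, seed=None):
    """Hide a random subset of particles and return (masked_jet, targets).

    `targets` maps each masked index to the original particle features,
    which is what a masked-modeling objective would try to reconstruct.
    """
    rng = random.Random(seed)
    # Mask at least one particle, but never more than the jet contains.
    n_masked = min(len(jet), max(1, round(mask_fraction * len(jet))))
    masked_indices = set(rng.sample(range(len(jet)), n_masked))
    masked_jet = [MASK_TOKEN if i in masked_indices else particle
                  for i, particle in enumerate(jet)]
    targets = {i: jet[i] for i in sorted(masked_indices)}
    return masked_jet, targets


# Toy jet with ten particles, each a made-up (pt, eta, phi) tuple.
toy_jet = [(float(i + 1), 0.1 * i, -0.1 * i) for i in range(10)]
masked_jet, targets = mask_particles(toy_jet, mask_fraction=0.3, seed=0)
```

Fixing `seed` makes the mask reproducible, which matters when different objectives are supposed to see matched samples.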

Innovation

Methods, ideas, or system contributions that make the work stand out.

Shared Transformer-based particle-cloud encoder used for every objective

Controlled comparison of representation learning objectives

Targeted supervised architectural modifications that achieve state-of-the-art performance on benchmark evaluations
