🤖 AI Summary
This work addresses the fine-grained classification of particle jets, in particular distinguishing light-quark from gluon jets, at high-energy colliders. It proposes a self-supervised Transformer foundation model for jet representation learning, introducing the Joint Embedding Predictive Architecture (JEPA) to particle physics. Instead of reconstructing raw inputs, the model predicts the embeddings of unseen jet constituents from those of a visible context, enabling physically consistent and data-efficient representation learning. Pretrained on the JetClass dataset (100 million jets), the model performs strongly on downstream tasks, including jet classification, top-quark tagging, and quark-gluon discrimination, and is compared against both conventional approaches and prior physics-informed models. It also transfers well across different jet datasets and benchmark conditions.
📝 Abstract
We present a transformer-based foundation model for tasks at high-energy particle colliders such as the Large Hadron Collider. We train the model to classify jets using a self-supervised strategy inspired by the Joint Embedding Predictive Architecture. We use the JetClass dataset, containing 100M jets of various known particles, to pre-train the model with a data-centric approach: the model uses a fraction of the jet constituents as the context to predict the embeddings of the unseen target constituents. Our pre-trained model transfers well to other datasets on standard classification benchmark tasks. We test our model on two additional downstream tasks: top tagging and differentiating light-quark jets from gluon jets. We also evaluate our model with task-specific metrics and baselines and compare it with state-of-the-art models in high-energy physics. Project site: https://hep-jepa.github.io/
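The JEPA-style pre-training described above can be illustrated with a minimal sketch. This is a toy NumPy mock-up, not the actual HEP-JEPA implementation: the linear maps stand in for the transformer context/target encoders and the predictor, and all names, dimensions, and the context fraction are illustrative assumptions. The key idea it shows is that the loss is computed in embedding space over the unseen target constituents, with no reconstruction of the raw inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy jet: a set of constituents with a few kinematic features
# (e.g. pT, eta, phi). Dimensions here are illustrative.
n_constituents, n_features, embed_dim = 32, 3, 8
jet = rng.normal(size=(n_constituents, n_features))

# Linear "encoders" stand in for the transformer encoders. In JEPA the
# target encoder is typically an EMA copy of the context encoder; here
# it is simply a frozen copy.
W_context = rng.normal(size=(n_features, embed_dim))
W_target = W_context.copy()
W_pred = np.eye(embed_dim)  # predictor, identity for this sketch

# Data-centric masking: a fraction of constituents form the visible
# context; the rest are the unseen targets.
context_frac = 0.5
perm = rng.permutation(n_constituents)
n_ctx = int(context_frac * n_constituents)
ctx_idx, tgt_idx = perm[:n_ctx], perm[n_ctx:]

# Encode the context, pool it into a jet-level summary, and predict an
# embedding for each target constituent from that summary.
ctx_emb = jet[ctx_idx] @ W_context            # (n_ctx, embed_dim)
pooled = ctx_emb.mean(axis=0)                 # summary of visible jet
pred = np.tile(pooled @ W_pred, (len(tgt_idx), 1))

# Embedding-space objective: match predicted embeddings to the target
# encoder's embeddings of the hidden constituents.
tgt_emb = jet[tgt_idx] @ W_target
loss = float(np.mean((pred - tgt_emb) ** 2))
print(f"embedding prediction loss: {loss:.4f}")
```

In the real model the pooled summary would be replaced by per-target predictions from a transformer predictor conditioned on the context embeddings, but the structure of the objective, predicting target-encoder embeddings rather than raw constituent features, is the same.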