🤖 AI Summary
In high-energy nuclear physics, final-state data from heavy-ion collisions (HIC) are high-dimensional and structurally complex, so conventional approaches that rely on hand-crafted observables risk missing nonlinear physical correlations. To address this, we propose a masked point cloud Transformer autoencoder framework trained in two stages: first, self-supervised pretraining to learn compact, information-rich latent representations; second, supervised fine-tuning for downstream tasks. Our method integrates point cloud modeling, self-supervised learning, and interpretability analysis (via SHAP and PCA), enabling both robust discrimination and physical insight. It achieves state-of-the-art performance on collision system size classification, significantly outperforming the PointNet baseline. The learned features show strong discriminative power while remaining physically interpretable, e.g., aligning with known collective flow patterns and centrality-dependent trends. All code is publicly available.
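The two-stage paradigm described above can be sketched in miniature. The following is an illustrative toy, not the paper's implementation: a single linear encoder/decoder stands in for the Transformer blocks, and the event size, feature count, latent dimension, and masking ratio are all assumed values chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy final-state "event": n_points particles, each with n_feat kinematic
# features (e.g. pT, eta, phi, m). All shapes below are illustrative.
n_points, n_feat, latent_dim, mask_ratio = 128, 4, 16, 0.6
event = rng.normal(size=(n_points, n_feat))

# Stage 1 (self-supervised pretraining): mask a fraction of the points,
# encode the rest, and score reconstruction of the masked points.
W_enc = rng.normal(scale=0.1, size=(n_feat, latent_dim))
W_dec = rng.normal(scale=0.1, size=(latent_dim, n_feat))

mask = rng.random(n_points) < mask_ratio
visible = event[~mask]

latent = visible @ W_enc                   # per-point latent codes
pooled = latent.mean(axis=0)               # event-level representation
recon = (event[mask] @ W_enc) @ W_dec      # toy reconstruction of masked points
recon_loss = np.mean((recon - event[mask]) ** 2)

# Stage 2 (supervised fine-tuning): a classification head on the pooled
# latent vector, e.g. large vs. small collision system.
W_head = rng.normal(scale=0.1, size=(latent_dim, 2))
logits = pooled @ W_head
pred = int(np.argmax(logits))
print(pooled.shape, round(float(recon_loss), 3), pred)
```

In the actual framework the encoder is a point cloud Transformer and both stages are trained by gradient descent; this sketch only fixes the data flow: mask, encode, reconstruct, then reuse the pooled latent for a downstream head.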
📝 Abstract
A central challenge in high-energy nuclear physics is to extract informative features from the high-dimensional final-state data of heavy-ion collisions (HIC) in order to enable reliable downstream analyses. Traditional approaches often rely on selected observables, which may miss subtle but physically relevant structures in the data. To address this, we introduce a Transformer-based autoencoder trained with a two-stage paradigm: self-supervised pretraining followed by supervised fine-tuning. The pretrained encoder learns latent representations directly from unlabeled HIC data, providing a compact and information-rich feature space that can be adapted to diverse physics tasks. As a case study, we apply the method to distinguish between large and small collision systems, where it achieves significantly higher classification accuracy than PointNet. Principal component analysis and SHAP interpretation further demonstrate that the autoencoder captures complex nonlinear correlations beyond individual observables, yielding features with strong discriminative and explanatory power. These results establish our two-stage framework as a general and robust foundation for feature learning in HIC, opening the door to more powerful analyses of quark-gluon plasma properties and other emergent phenomena. The implementation is publicly available at https://github.com/Giovanni-Sforza/MaskPoint-AMPT.
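The PCA half of the interpretability analysis can be illustrated with a minimal sketch. The latent vectors below are synthetic stand-ins (not real encoder outputs), with two directions given artificially large variance to mimic dominant physical structure; the SHAP part would additionally require a trained model and the `shap` library, so only the variance decomposition is shown.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical batch of event-level latent vectors from a pretrained encoder:
# 500 events x 16 latent dimensions. Two directions are inflated to mimic
# dominant structure (e.g. flow- or centrality-like trends).
latents = rng.normal(size=(500, 16))
latents[:, 0] *= 5.0
latents[:, 1] *= 3.0

# PCA via SVD of the centered feature matrix.
centered = latents - latents.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = S**2 / np.sum(S**2)   # fraction of variance per principal component

print(np.round(explained[:3], 2))
```

A spectrum concentrated in a few leading components, as here, is what would indicate that the learned feature space compresses the event into a small number of physically meaningful directions.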