Spatially Aware Linear Transformer (SAL-T) for Particle Jet Tagging

📅 2025-10-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Standard Transformers incur high inference latency and excessive memory consumption in high-energy physics jet classification due to their quadratic computational complexity. Method: We propose a physics-inspired linear-attention model that incorporates a spatially aware partitioning mechanism grounded in particle kinematic features, integrates lightweight convolutional layers to capture local correlations, and explicitly embeds jet physics priors into the network architecture. Contribution/Results: On jet classification benchmarks, our model matches the accuracy of full-attention Transformers while substantially outperforming Linformer. It reduces GPU memory usage by 42% and inference latency by 58%. Cross-domain evaluation on ModelNet10 further demonstrates strong generalization capability, confirming the effectiveness of physics-informed architectural design beyond domain-specific tasks.

Technology Category

Application Category

📝 Abstract
Transformers are very effective in capturing both global and local correlations within high-energy particle collisions, but they present deployment challenges in high-data-throughput environments, such as the CERN LHC. The quadratic complexity of transformer models demands substantial resources and increases latency during inference. In order to address these issues, we introduce the Spatially Aware Linear Transformer (SAL-T), a physics-inspired enhancement of the linformer architecture that maintains linear attention. Our method incorporates spatially aware partitioning of particles based on kinematic features, thereby computing attention between regions of physical significance. Additionally, we employ convolutional layers to capture local correlations, informed by insights from jet physics. In addition to outperforming the standard linformer in jet classification tasks, SAL-T also achieves classification results comparable to full-attention transformers, while using considerably fewer resources with lower latency during inference. Experiments on a generic point cloud classification dataset (ModelNet10) further confirm this trend. Our code is available at https://github.com/aaronw5/SAL-T4HEP.
Problem

Research questions and friction points this paper is trying to address.

Reducing transformer complexity for particle jet tagging
Addressing quadratic resource demands in high-energy physics
Maintaining accuracy while lowering inference latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear attention transformer for particle jet tagging
Spatially aware partitioning based on kinematic features
Convolutional layers capture local correlations in jets
🔎 Similar Papers
No similar papers found.
A
Aaron Wang
University of Illinois Chicago, Chicago, IL 60607
Zihan Zhao
Zihan Zhao
Shanghai Jiao Tong University
NLP
S
Subash Katel
University of California San Diego, La Jolla, CA 92093
V
Vivekanand Gyanchand Sahu
University of California San Diego, La Jolla, CA 92093
Elham E Khoda
Elham E Khoda
University of California San Diego, La Jolla, CA 92093
A
Abhijith Gandrakota
Fermi National Accelerator Laboratory, Batavia, IL 60510
Jennifer Ngadiuba
Jennifer Ngadiuba
Wilson Fellow, Fermilab
experimental high-energy physicsdata sciencedeep learningartificial intelligenceFPGAs
Richard Cavanaugh
Richard Cavanaugh
University of Illinois Chicago, Chicago, IL 60607
J
Javier Duarte
University of California San Diego, La Jolla, CA 92093