JetFormer: A Scalable and Efficient Transformer for Jet Tagging from Offline Analysis to FPGA Triggers

📅 2026-01-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the gap between high-precision offline jet tagging and ultra-low-latency online trigger deployment in high-energy physics by proposing JetFormer, a general and scalable encoder-only Transformer architecture that directly processes variable-length particle sequences without explicitly modeling inter-particle interactions. Through hardware-aware design, multi-objective hyperparameter optimization, structured pruning, and quantization, lightweight variants such as JetFormer-tiny are tailored to FPGA resource constraints. On the JetClass dataset, JetFormer matches the accuracy of ParT to within 0.7% while using 37.4% fewer FLOPs; on the HLS4ML 150P benchmark, it outperforms MLP and Deep Sets baselines by 3–4% in accuracy while achieving sub-microsecond inference latency on FPGAs. This is the first unified, end-to-end Transformer deployment spanning offline analysis and real-time trigger systems.
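For concreteness, the sketch below shows one way an encoder-only Transformer can consume variable-length particle lists through a padding mask, without any pairwise-interaction inputs. This is an illustrative PyTorch sketch rather than the authors' implementation; the class name and all sizes (16 input features, model width 64, 4 heads, 4 layers, 10 classes) are assumptions chosen for demonstration.

```python
# Minimal sketch (not the authors' code) of an encoder-only Transformer jet tagger.
# Each jet is a padded set of particles; a boolean mask marks the padded slots so
# attention and pooling ignore them. No explicit pairwise-interaction features are used.
import torch
import torch.nn as nn

class EncoderOnlyJetTagger(nn.Module):
    def __init__(self, n_features=16, d_model=64, n_heads=4, n_layers=4, n_classes=10):
        super().__init__()
        # Per-particle embedding: each particle's feature vector is projected independently.
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=2 * d_model,
            dropout=0.0, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, particles, pad_mask):
        # particles: (batch, max_particles, n_features); pad_mask: True where padded.
        x = self.embed(particles)
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        # Masked mean-pooling over the particle axis gives a fixed-size jet representation.
        x = x.masked_fill(pad_mask.unsqueeze(-1), 0.0)
        n_valid = (~pad_mask).sum(dim=1, keepdim=True).clamp(min=1)
        return self.classifier(x.sum(dim=1) / n_valid)

# Example: a batch of two jets, zero-padded to 50 particles each.
model = EncoderOnlyJetTagger()
parts = torch.randn(2, 50, 16)
mask = torch.zeros(2, 50, dtype=torch.bool)
mask[0, 30:] = True          # the first jet has only 30 real particles
logits = model(parts, mask)  # shape (2, 10)
```

Because pooling averages only over unpadded particles, the jet representation and the classifier output do not depend on how much padding a given jet carries.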

📝 Abstract
We present JetFormer, a versatile and scalable encoder-only Transformer architecture for particle jet tagging at the Large Hadron Collider (LHC). Unlike prior approaches that are often tailored to specific deployment regimes, JetFormer is designed to operate effectively across the full spectrum of jet tagging scenarios, from high-accuracy offline analysis to ultra-low-latency online triggering. The model processes variable-length sets of particle features without requiring explicit pairwise-interaction inputs, yet achieves competitive or superior performance compared to state-of-the-art methods. On the large-scale JetClass dataset, a large JetFormer variant matches the accuracy of the interaction-rich ParT model (within 0.7%) while using 37.4% fewer FLOPs, demonstrating its computational efficiency and strong generalization. On the HLS4ML 150P benchmark datasets, JetFormer consistently outperforms existing models such as MLPs, Deep Sets, and Interaction Networks by 3-4% in accuracy. To bridge the gap to hardware deployment, we further introduce a hardware-aware optimization pipeline based on multi-objective hyperparameter search, yielding compact variants such as JetFormer-tiny that are suitable for FPGA-based trigger systems with sub-microsecond latency requirements. Through structured pruning and quantization, we show that JetFormer can be aggressively compressed with minimal accuracy loss. By unifying high-performance modeling and deployability within a single architectural framework, JetFormer provides a practical pathway for deploying Transformer-based jet taggers in both offline and online environments at the LHC. Code is available at https://github.com/walkieq/JetFormer.
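As a rough illustration of the compression step mentioned in the abstract (structured pruning followed by quantization), the snippet below sketches one common recipe using PyTorch's pruning and dynamic-quantization utilities. It is a hedged example under assumed settings (50% structured pruning of linear layers, int8 dynamic quantization) and is not the paper's hardware-aware pipeline; for FPGA triggers the compressed model would typically be exported through an FPGA toolflow such as hls4ml rather than executed with PyTorch's CPU quantization backend.

```python
# Illustrative compression sketch only (assumed settings, not the paper's pipeline):
# structured pruning of linear layers followed by post-training dynamic quantization.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def compress(model: nn.Module, prune_ratio: float = 0.5) -> nn.Module:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            # Structured pruning: zero out whole output neurons, ranked by their L2 norm.
            prune.ln_structured(module, name="weight", amount=prune_ratio, n=2, dim=0)
            prune.remove(module, "weight")  # bake the pruning mask into the weights
    # Post-training dynamic quantization of the linear layers to int8.
    return torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Usage on a small stand-in model (the real target would be a JetFormer-style tagger).
tiny = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 5))
compressed = compress(tiny)
print(compressed)
```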
Problem

Research questions and friction points this paper is trying to address.

jet tagging
Transformer
FPGA triggers
LHC
model deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer
Jet Tagging
Hardware-aware Optimization
FPGA Deployment
Efficient Neural Architecture
Ruoqing Zheng
Imperial College London, UK
Chang Sun
California Institute of Technology, USA
Qibin Liu
SLAC National Accelerator Laboratory, USA
Lauri Laatu
Imperial College London, UK
Arianna Cox
Imperial College London, UK
B. Maier
Imperial College London, UK
Alexander Tapper
Imperial College London, UK
J. G. Coutinho
Imperial College London, UK
Wayne Luk
Professor of Computer Engineering, Imperial College London
Hardware and Architecture, Reconfigurable Computing, Design Automation
Zhiqiang Que
Imperial College London, UK