Patch Hierarchical Attention Transformer for Efficient Particle Jet Tagging

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses real-time jet tagging in high-luminosity Large Hadron Collider detectors, where trigger systems must identify short-lived particle decays under stringent latency and accuracy constraints. Conventional Transformers struggle to fit within trigger resource limits because the cost of self-attention grows quadratically with the number of particles. To overcome this, the authors propose PHAT-JeT, which integrates physics-inspired geometric message passing with a hierarchical patch-based self-attention mechanism. The design computes exact attention within local particle groups while preserving global context through lightweight communication between patch tokens. Under a restricted computational budget, PHAT-JeT reports state-of-the-art classification accuracy and background rejection among resource-constrained jet tagging models on four benchmarks: hls4ml, JetClass, Top Tagging, and Quark–Gluon.

📝 Abstract

Real-time jet tagging is critical for identifying short-lived particle decays in the high-throughput detectors of the Large Hadron Collider, where the trigger systems that decide which collision events to store impose strict latency and accuracy constraints. While transformer architectures achieve the highest jet tagging accuracy when compute is unconstrained, their quadratic self-attention cost makes inference prohibitive under trigger budgets. Existing efficient variants reduce the computational cost but degrade classification performance. To address this limitation, we introduce the Patch Hierarchical Attention Transformer (PHAT-JeT), which combines two mechanisms: a physics-inspired geometric message-passing module that encodes local detector-plane structure, and a hierarchical patch-based attention scheme that computes exact attention within small particle groups while preserving global context through lightweight patch-token communication. Within a restricted budget, PHAT-JeT achieves state-of-the-art accuracy and background rejection among all resource-constrained jet tagging models on four benchmarks (hls4ml, JetClass, Top Tagging, and Quark–Gluon). Our code is available at https://github.com/aaronw5/PHAT-JeT.
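
The patch-based attention idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the PHAT-JeT implementation from the repository; it is a simplified illustration under several assumptions that do not come from the paper: particles are already ordered so that neighbours in the detector plane are contiguous, patches have a fixed size, patch tokens are formed by mean pooling, there are no learned projections, and the geometric message-passing module is omitted. The function names `patch_hierarchical_attention` and `attention_score_count` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Plain scaled dot-product attention; works on batched inputs.
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v


def patch_hierarchical_attention(x, patch_size):
    """Two-level attention over n particles with d features each.

    Assumption: rows of x are pre-sorted so that each block of
    `patch_size` consecutive rows is a spatially local group.
    """
    n, d = x.shape
    assert n % patch_size == 0, "pad the jet so n is a multiple of patch_size"
    num_patches = n // patch_size

    # Level 1: exact attention inside each patch.
    patches = x.reshape(num_patches, patch_size, d)
    local = attention(patches, patches, patches)

    # Level 2: one token per patch (mean pooling, an assumption),
    # then attention among the patch tokens for global context.
    tokens = local.mean(axis=1)
    mixed = attention(tokens, tokens, tokens)

    # Broadcast each patch's global summary back to its particles.
    out = local + mixed[:, None, :]
    return out.reshape(n, d)


def attention_score_count(n, patch_size):
    # Number of pairwise scores computed by the two-level scheme.
    num_patches = n // patch_size
    return num_patches * patch_size**2 + num_patches**2
```

For a jet of 64 particles split into patches of 8, this sketch computes 8 × 8² + 8² = 576 attention scores, compared with 64² = 4096 for full self-attention, which shows where the savings over the quadratic cost come from. Information still flows between distant particles, but only through the patch tokens.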

Problem

Research questions and friction points this paper is trying to address.

jet tagging

real-time trigger

transformer efficiency

particle physics

computational constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Attention

Patch-based Transformer

Geometric Message Passing