AI Summary
This work addresses the stringent latency and hardware constraints of the High-Luminosity Large Hadron Collider (HL-LHC) Level-1 trigger system by systematically exploring matrix product states (MPS) and tree tensor networks (TTN) as compact, interpretable alternatives to deep neural networks for jet tagging. By leveraging the intrinsic compositional structure of jets and applying post-training quantization, the proposed models become suitable for efficient deployment on field-programmable gate arrays (FPGAs). Experimental results show that these tensor network approaches reach sub-microsecond inference latency while maintaining classification performance comparable to state-of-the-art deep learning models. This eases the deployment bottlenecks that conventional deep architectures face in real-time triggering, supporting the feasibility of tensor networks for online applications in high-energy physics.
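To make the quantization step concrete, the sketch below rounds trained tensor-network weights onto a signed fixed-point grid in the style of HLS `ap_fixed<W, I>` types, as commonly done before FPGA synthesis. This is a minimal illustration under assumed conventions, not the paper's implementation; the function name and bit-width defaults are hypothetical.

```python
import numpy as np

def quantize_fixed_point(w: np.ndarray, total_bits: int = 8, int_bits: int = 2) -> np.ndarray:
    """Round weights to a signed fixed-point grid (HLS ap_fixed<W, I> style).

    total_bits : W, total word length including the sign bit
    int_bits   : I, integer bits (sign included); W - I bits carry the fraction
    """
    frac_bits = total_bits - int_bits
    scale = 2.0 ** frac_bits
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    return np.clip(np.round(w * scale), lo, hi) / scale   # fake-quantized floats

# Example: quantize one (hypothetical) MPS core and check the rounding error.
w = np.random.default_rng(0).normal(scale=0.5, size=(2, 8, 8))
wq = quantize_fixed_point(w, total_bits=8, int_bits=2)
print(np.max(np.abs(w - wq)))   # bounded by half an LSB, 2**-7, unless values clip
```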
Abstract
We present a systematic study of Tensor Network (TN) models, Matrix Product States (MPS) and Tree Tensor Networks (TTN), for real-time jet tagging in high-energy physics, with a focus on low-latency deployment on Field Programmable Gate Arrays (FPGAs). Motivated by the strict requirements of the HL-LHC Level-1 trigger system, we explore TNs as compact and interpretable alternatives to deep neural networks. Using low-level jet constituent features, our models achieve performance competitive with state-of-the-art deep learning classifiers. We investigate post-training quantization to enable hardware-efficient implementations without degrading classification performance or increasing latency. The best-performing models are synthesized to estimate FPGA resource usage, latency, and memory occupancy, demonstrating sub-microsecond latency and supporting the feasibility of online deployment in real-time trigger systems. Overall, this study highlights the potential of TN-based models for fast and resource-efficient inference in low-latency environments.
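To illustrate how such a classifier works: an MPS model embeds each scalar constituent feature into a small local vector and contracts it with a chain of low-rank cores, so inference reduces to a sequence of small matrix-vector products. The NumPy sketch below uses common conventions (a cosine/sine local feature map and a label leg closing the chain); the shapes, names, boundary treatment, and initialization are assumptions for illustration, not the paper's exact architecture.

```python
import numpy as np

def local_map(x: float) -> np.ndarray:
    """Embed one scalar feature (scaled to [0, 1]) as a 2-dim local vector."""
    return np.array([np.cos(np.pi / 2 * x), np.sin(np.pi / 2 * x)])

def mps_classify(cores: list, label_core: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Contract an MPS classifier with one jet's constituent features.

    cores      : N arrays of shape (d, D, D), one core per input feature
    label_core : array of shape (D, n_classes) closing the chain
    features   : N scalar constituent features
    Returns raw class scores of shape (n_classes,).
    """
    v = np.ones(cores[0].shape[1])                        # trivial left boundary, shape (D,)
    for core, x in zip(cores, features):
        mat = np.einsum('s,sij->ij', local_map(x), core)  # fold in the local feature
        v = v @ mat                                       # one O(D^2) matrix-vector step
    return v @ label_core                                 # project onto the label leg

# Toy usage with random cores (bond dimension D=8, 16 features, 5 classes).
rng = np.random.default_rng(0)
D, d, N, C = 8, 2, 16, 5
cores = [rng.normal(size=(d, D, D)) / np.sqrt(D) for _ in range(N)]
scores = mps_classify(cores, rng.normal(size=(D, C)), rng.uniform(size=N))
```

Because each step is a fixed-size matrix-vector product, the contraction pipelines naturally in hardware, which is part of what makes sub-microsecond FPGA latency plausible for these models.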