🤖 AI Summary
Jet reconstruction in high-energy collider experiments requires jointly inferring a latent binary-tree topology and physical parameters (e.g., energy, momentum, particle type). The task is hard because the number of candidate topologies grows super-exponentially with the number of observed particles, which makes conventional Bayesian inference intractable. This paper introduces a Combinatorial Sequential Monte Carlo (CSMC) approach for inferring the latent structure, uses the resulting estimator in a variational inference algorithm for parameter learning, and builds a variational family on a pseudo-marginal framework to give a fully Bayesian treatment of both structure and parameters. Evaluated on data simulated from a collider physics generative model, the method is reported to achieve superior speed and accuracy across a range of tasks.
📝 Abstract
Reconstructing jets, which provide vital insights into the properties and histories of subatomic particles produced in high-energy collisions, is a central problem in collider physics data analysis. The task requires estimating the latent structure of a jet, a binary tree, together with parameters such as particle energy, momentum, and type. While Bayesian methods offer a natural way to handle uncertainty and leverage prior knowledge, they face significant challenges because the number of possible jet topologies grows super-exponentially with the number of observed particles. To address this, we introduce a Combinatorial Sequential Monte Carlo approach for inferring jet latent structures. As a second contribution, we leverage the resulting estimator to develop a variational inference algorithm for parameter learning. Building on this, we introduce a variational family using a pseudo-marginal framework for a fully Bayesian treatment of all variables, unifying the generative model with the inference process. We illustrate our method's effectiveness through experiments using data generated with a collider physics generative model, highlighting superior speed and accuracy across a range of tasks.
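
To give a sense of the super-exponential growth mentioned above, the following minimal sketch counts rooted binary tree topologies over distinguishable leaves using the standard double-factorial formula (2n - 3)!!. It assumes that each observed particle is a labeled leaf and that the ordering of merges is ignored; the function name `num_rooted_binary_topologies` is hypothetical and is not taken from the paper.

```python
def num_rooted_binary_topologies(n_leaves: int) -> int:
    """Count rooted binary trees with n_leaves labeled leaves: (2n - 3)!!.

    Assumption: leaves are distinguishable and merge order is ignored.
    """
    if n_leaves < 1:
        raise ValueError("n_leaves must be at least 1")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for odd in range(3, 2 * n_leaves - 2, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 10, 20):
        print(n, num_rooted_binary_topologies(n))
```

Under these assumptions, four leaves admit 15 topologies and ten leaves admit 34,459,425, which suggests why exhaustive enumeration quickly becomes impractical and sampling-based approximations are attractive.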