Data-Driven, Geometry-Aware Optimal-Transport Calibration of Flavor Tagger

📅 2026-05-02

🤖 AI Summary

Existing flavor-tagging calibrations provide scale factors at discrete working points or binned corrections to a one-dimensional discriminant, so they cannot deliver the continuous, event-level calibration that the multicomponent outputs of modern taggers call for. Analyses that need high-performance flavor tagging lose information as a result. The authors formulate flavor-tagger calibration as an optimal transport problem on the probability simplex and parameterize the Brenier map in isometric log-ratio coordinates, where the quadratic Euclidean cost corresponds to the Aitchison distance on the simplex. Flavor-conditional target distributions are extracted from control-region data with an expectation-maximization fit that treats multiple control regions simultaneously, models each flavor component with a normalizing flow, and estimates the regional mixture fractions. A linearized feedback-operator analysis then propagates the fitted composition covariance into the extracted densities, separating data-constrained modes from those dominated by the composition prior. A simulation-based closure study shows improved closure both in dedicated control regions and in independent validation mixtures.
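
To make the isometric log-ratio (ilr) coordinates concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a three-class tagger output (for example b, c, light scores summing to one) and a Helmert-type orthonormal basis; the function names `helmert_basis`, `ilr`, `ilr_inverse`, and `aitchison_distance` are illustrative choices, and the two score vectors are made-up numbers. It checks numerically that the Euclidean distance between ilr coordinates matches the Aitchison distance computed from pairwise log-ratios.

```python
import numpy as np

def helmert_basis(d):
    """Return a (d-1, d) orthonormal basis of the zero-sum subspace of R^d."""
    basis = np.zeros((d - 1, d))
    for i in range(1, d):
        basis[i - 1, :i] = 1.0 / i
        basis[i - 1, i] = -1.0
        basis[i - 1] *= np.sqrt(i / (i + 1.0))  # normalize the row to unit length
    return basis

def ilr(p, basis):
    """Map a composition p (positive, sums to 1) to R^(d-1)."""
    log_p = np.log(p)
    clr = log_p - log_p.mean()  # centered log-ratio
    return clr @ basis.T

def ilr_inverse(z, basis):
    """Map ilr coordinates back to the simplex (softmax of the clr vector)."""
    clr = z @ basis
    w = np.exp(clr - clr.max())
    return w / w.sum()

def aitchison_distance(p, q):
    """Aitchison distance from its pairwise log-ratio definition."""
    lp, lq = np.log(p), np.log(q)
    diff = (lp[:, None] - lp[None, :]) - (lq[:, None] - lq[None, :])
    return np.sqrt(np.sum(diff ** 2) / (2 * len(p)))

# Hypothetical tagger outputs for two jets (illustrative values only).
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.3, 0.3, 0.4])

basis = helmert_basis(3)
d_ilr = np.linalg.norm(ilr(p, basis) - ilr(q, basis))
d_aitchison = aitchison_distance(p, q)
p_roundtrip = ilr_inverse(ilr(p, basis), basis)
```

Because the two distances agree, a transport map trained with a quadratic Euclidean cost in these coordinates penalizes displacement as measured by the Aitchison geometry on the simplex. Any other orthonormal basis of the zero-sum subspace would give the same distances.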

📝 Abstract

Flavor-tagging calibrations are often provided either as scale factors measured at a finite set of working points or as binned corrections to a chosen one-dimensional discriminant. However, these approaches fall short of providing continuous, event-level calibration across the full multicomponent outputs of modern taggers. This limitation leads to information loss in analyses that demand high-performance flavor tagging, restricting them to a limited set of predefined variables. In this work, we propose a geometry-aware framework that formulates flavor-tagger calibration as an optimal transport problem on the probability simplex. The transport maps are parameterized and trained in the isometric log-ratio coordinate system. Because the quadratic Euclidean cost of Brenier transport in this coordinate system is equivalent to the Aitchison distance on the simplex, the learned map induces a minimal deformation under the Aitchison geometry. Furthermore, we extract flavor-conditional target distributions directly from control-region data using an expectation-maximization (EM) technique that simultaneously fits multiple control regions, models each flavor component with a normalizing flow, and estimates the regional mixture fractions. The extracted targets are subsequently used to learn flavor-factorized transport maps. Because the joint estimation of mixture fractions and flexible component densities admits weakly constrained directions, we further introduce a linearized feedback-operator analysis that propagates the fitted composition covariance into the extracted component densities, separating data-constrained modes from those dominated by the composition prior. A simulation-based closure study demonstrates improved closure in dedicated control regions and in independent validation mixtures.
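
The multi-region EM fit described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method: the normalizing-flow components are replaced here by one-dimensional Gaussians, there is no composition prior, and the data are toy samples. The function name `em_multi_region`, the two-region setup, and all numerical values are assumptions made for illustration. What the sketch keeps is the structure: component densities are shared across regions, while each region has its own mixture fractions.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def em_multi_region(regions, n_components, n_iter=200):
    """Fit shared Gaussian components and per-region mixture fractions.

    regions: list of 1D arrays, one per control region.
    Returns (fractions, means, stds), fractions has shape (n_regions, n_components).
    """
    pooled = np.concatenate(regions)
    # Spread the initial means over quantiles of the pooled sample.
    means = np.quantile(pooled, (np.arange(n_components) + 0.5) / n_components)
    stds = np.full(n_components, pooled.std())
    fractions = np.full((len(regions), n_components), 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: per-event responsibilities, using each region's own fractions.
        resp = []
        for r, x in enumerate(regions):
            w = fractions[r] * gaussian_pdf(x[:, None], means, stds)
            resp.append(w / w.sum(axis=1, keepdims=True))

        # M-step: fractions are updated region by region ...
        for r, g in enumerate(resp):
            fractions[r] = g.mean(axis=0)
        # ... while the shared component shapes use all regions pooled together.
        g_all = np.concatenate(resp)
        weight = g_all.sum(axis=0)
        means = (g_all * pooled[:, None]).sum(axis=0) / weight
        stds = np.sqrt((g_all * (pooled[:, None] - means) ** 2).sum(axis=0) / weight)

    return fractions, means, stds

# Toy data: two regions with different compositions of the same two components.
rng = np.random.default_rng(0)
true_fractions = np.array([[0.8, 0.2], [0.3, 0.7]])

def sample_region(frac, n=4000):
    labels = rng.choice(2, size=n, p=frac)
    return rng.normal(loc=np.where(labels == 0, -2.0, 2.0), scale=1.0)

regions = [sample_region(f) for f in true_fractions]
fractions, means, stds = em_multi_region(regions, n_components=2)
```

In this toy setup the regions differ in composition, which is what lets a joint fit separate the shared components. With flexible densities in place of Gaussians, some directions become weakly constrained, which is the situation the feedback-operator analysis in the abstract is meant to diagnose.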

Problem

Research questions and friction points this paper is trying to address.

flavor tagging

calibration

optimal transport

probability simplex

event-level

Innovation

Methods, ideas, or system contributions that make the work stand out.

optimal transport

Aitchison geometry

normalizing flow