🤖 AI Summary
This paper addresses the gap between the random-variable picture of probability theory and the Markov category framework: the channel-based picture has been axiomatized by Markov categories, while the categorical semantics of the random-variable picture remain less clear. To close this gap, we show that, for any suitable Markov category, one can define a category of sample spaces that satisfies Simpson's axioms, and that a theory of probability sheaves can be developed purely synthetically in this setting. Our key contributions are threefold: (i) bringing Simpson's probability sheaves together with the generality and abstraction of Markov categories; (ii) a sample space category carrying the structure Simpson identified, most notably an abstract notion of conditional independence; and (iii) a uniform recovery of Simpson's examples, ranging from probability over databases to nominal sets, from well-known Markov categories, together with further generalizations. The framework may serve as a categorical basis for probabilistic programming, uncertainty reasoning, and structured semantic modeling.
📝 Abstract
Two high-level "pictures" of probability theory have emerged: one that takes as central the notion of random variable, and one that focuses on distributions and probability channels (Markov kernels). While the channel-based picture has been successfully axiomatized, and widely generalized, using the notion of Markov category, the categorical semantics of the random variable picture remain less clear. Simpson's probability sheaves are a recent approach, in which probabilistic concepts like random variables are allowed to vary over a site of sample spaces. Simpson has identified rich structure on these sites, most notably an abstract notion of conditional independence, and given examples ranging from probability over databases to nominal sets. We aim to bring this development together with the generality and abstraction of Markov categories: We show that for any suitable Markov category, a category of sample spaces can be defined which satisfies Simpson's axioms, and that a theory of probability sheaves can be developed purely synthetically in this setting. We recover Simpson's examples in a uniform fashion from well-known Markov categories, and consider further generalizations.