🤖 AI Summary
This work addresses probabilistic programs with user-annotated sampling statements and while loops (e.g., Gen, Turing, Pyro), where random variables may be generated dynamically. We present the first static factorisation method supporting such dynamic variable generation. Our approach extends operational semantics, constructs a probability-annotated control-flow graph, and integrates static dependency analysis with program slicing to yield an exact factorisation of the implicitly defined program density, provably equivalent to the Bayesian network factorisation where one exists. The method is proven correct and overcomes a longstanding limitation of traditional graphical models, which require a statically fixed set of variables. Empirical evaluation demonstrates that our representation substantially reduces gradient-estimation variance in variational inference and accelerates convergence in single-site Metropolis–Hastings and sequential Monte Carlo inference, achieving performance competitive with or superior to state-of-the-art techniques.
📝 Abstract
It is commonly known that any Bayesian network can be implemented as a probabilistic program, but the reverse direction is not so clear. In this work, we address the open question of to what extent a probabilistic program with user-labelled sample statements and while loops, features found in languages like Gen, Turing, and Pyro, can be represented graphically. To this end, we extend existing operational semantics to support these language features. By translating a program to its control-flow graph, we define a sound static analysis that approximates the dependency structure of the random variables in the program. As a result, we obtain a static factorisation of the implicitly defined program density, which is equivalent to the known Bayesian network factorisation for programs without loops and with constant labels, but constitutes a novel graphical representation for programs that define an unbounded number of random variables via loops or dynamic labels. We further develop a sound program slicing technique that leverages this structure to statically enable three well-known optimisations for the considered program class: we reduce the variance of gradient estimates in variational inference, and we speed up both single-site Metropolis–Hastings and sequential Monte Carlo. These optimisations are proven correct and empirically shown to match or outperform existing techniques.
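To make the program class concrete, the following is a minimal sketch (not from the paper) of the kind of program the abstract describes, written in plain Python rather than a real PPL such as Gen, Turing, or Pyro. The while loop and the loop-counter-dependent label `x_{i}` mean the number of random variables in a trace is itself random, which is exactly what a fixed Bayesian network cannot express but the paper's static factorisation can.

```python
import random

def program():
    """A toy probabilistic program with a while loop and dynamic labels.

    Each iteration draws a fresh Bernoulli(0.5) variable under the
    dynamically constructed label "x_{i}", so the set of random
    variables in a trace is not fixed statically.
    """
    trace = {}
    i = 0
    while True:
        name = f"x_{i}"                        # dynamic label from loop counter
        trace[name] = random.random() < 0.5    # Bernoulli(0.5) sample
        if trace[name]:
            break                              # stop at the first success
        i += 1
    return trace

# The implicit trace density factorises as p(x_0, ..., x_n) = prod_i p(x_i),
# where n itself is random: an unbounded factorisation that no Bayesian
# network with a fixed variable set captures directly.
```

Here every `x_i` happens to be independent; the paper's contribution is a static analysis that recovers the (possibly non-trivial) dependency structure among such dynamically generated variables in general.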