🤖 AI Summary
This work addresses the computational challenges of Markov chain Monte Carlo (MCMC) inference for directed acyclic graphs (DAGs) in moderate- to high-dimensional settings, where the search space grows super-exponentially with the number of nodes. Existing approaches that restrict the search space to a subset of edges improve scalability but introduce approximation error. The authors derive sharp lower and upper bounds on the total variation distance between the unrestricted posterior and the posterior induced by a state-of-the-art restricted-search-space MCMC method, and use these bounds to identify regimes in which the approximation error is negligible and regimes in which it is not. To reduce the error, they propose a transdimensional MCMC sampler whose search space expands or contracts dynamically as the chain progresses. The sampler is defined by birth-and-death rates that induce a prior distribution on the set of search spaces. The paper also outlines an efficient implementation and reports the sampler's finite-sample performance in simulation studies.
📝 Abstract
Inferring directed acyclic graphs (DAGs) from data via Markov chain Monte Carlo (MCMC) is computationally challenging in moderate- to high-dimensional settings because the discrete space of DAGs grows super-exponentially with the number of nodes. To address scalability, several recent MCMC-based graph inference methods restrict the search space to a subset of edges, at the cost of introducing error into the inference procedure.
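To give a sense of the super-exponential growth mentioned above, the following minimal sketch counts labeled DAGs using Robinson's recurrence, a standard combinatorial result that is not taken from this paper. The function name `count_dags` is chosen here for illustration only.

```python
from math import comb


def count_dags(n_max):
    """Return [a(0), ..., a(n_max)], where a(n) is the number of labeled DAGs on n nodes.

    Uses Robinson's recurrence:
        a(n) = sum_{k=1}^{n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k),  a(0) = 1.
    """
    counts = [1]  # a(0) = 1: the empty graph on zero nodes
    for n in range(1, n_max + 1):
        total = 0
        for k in range(1, n + 1):
            sign = (-1) ** (k + 1)
            total += sign * comb(n, k) * 2 ** (k * (n - k)) * counts[n - k]
        counts.append(total)
    return counts


if __name__ == "__main__":
    for n, a in enumerate(count_dags(10)):
        print(n, a)
```

Already at five nodes there are 29,281 labeled DAGs, which is why exhaustive enumeration of the posterior quickly becomes infeasible.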
In this work, we derive sharp lower and upper bounds on the total variation distance between the unrestricted posterior distribution and the posterior distribution induced by a state-of-the-art restricted-search-space MCMC method. These bounds characterize regimes in which the approximation error is negligible and regimes in which it is not. To reduce the error, we propose a flexible transdimensional MCMC sampler that allows the search space to expand or contract dynamically as the chain progresses. The sampler is defined by birth-and-death rates that induce a prior distribution on the set of search spaces, rather than assuming a fixed restricted search space throughout. We outline an efficient implementation of the proposed algorithm and demonstrate its finite-sample performance through simulation studies.
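As a toy illustration of the quantity being bounded, the sketch below computes the total variation distance between a small discrete posterior and a restricted version of it. It assumes, purely for illustration, that restriction means renormalizing the unrestricted posterior over an allowed subset of graphs; the restricted posterior studied in the paper may be defined differently. The probabilities and the helper names `total_variation` and `restrict` are hypothetical.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def restrict(p, allowed):
    """Renormalize p over the allowed states (an assumed notion of restriction)."""
    mass = sum(p[x] for x in allowed)
    return {x: p[x] / mass for x in allowed}


# Hypothetical posterior over the three DAGs on two nodes A and B.
posterior = {"empty": 0.1, "A->B": 0.5, "B->A": 0.4}

# Restricted search space that forbids the edge B->A.
restricted = restrict(posterior, {"empty", "A->B"})

tv = total_variation(posterior, restricted)
print(round(tv, 6))
```

Under this renormalization assumption, the distance equals the posterior mass of the excluded graphs (0.4 in this example), so the error is negligible only when the restricted space captures almost all of the posterior mass.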