🤖 AI Summary
Exact goodness-of-fit testing for discrete exponential family models is computationally intractable in high-dimensional sparse settings, primarily because it is difficult to sample lattice points efficiently from the constrained fiber. This paper introduces a reinforcement learning framework for fiber sampling by formulating the task as a Markov decision process. The authors propose an actor-critic sampling algorithm that combines lattice geometry, exchangeable sampling, and scalable linear-algebra techniques, and that comes with a convergence guarantee. Compared with conventional MCMC samplers that depend on expensive non-linear algebra computations, the method reduces computation time while producing an exchangeable sample in high-dimensional sparse regimes. It thereby addresses a central bottleneck in exact inference for large-scale structured data.
📝 Abstract
We consider the problem of constructing exact goodness-of-fit tests for discrete exponential family models. This classical problem remains practically unsolved for many types of structured or sparse data, as it rests on a computationally difficult core task: to produce a reliable sample from lattice points in a high-dimensional polytope. We translate the problem into a Markov decision process and demonstrate a reinforcement learning approach for learning "good moves" for sampling. We illustrate the approach on data sets and models for which traditional MCMC samplers converge too slowly due to problem size, sparsity structure, and the need for prohibitively expensive non-linear algebra computations. The differentiating factor is the use of scalable tools from linear algebra in the context of theoretical guarantees provided by non-linear algebra. Our algorithm is based on an actor-critic sampling scheme with provable convergence. The discovered moves can be used to efficiently obtain an exchangeable sample, significantly cutting computation time for statistical testing.
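To make the "moves on a fiber" idea concrete, the following is a minimal sketch of the classical setting, not the authors' algorithm. It assumes the independence model for a two-way contingency table, where the fiber is the set of non-negative integer tables sharing the observed row and column sums. The state is the current table, an action is a basic move that adds +1/-1 on a 2x2 minor, and a move is rejected if it would produce a negative cell. The function names (`margins`, `step`, `random_walk`) are hypothetical, and a uniform random choice of action stands in for the learned actor-critic policy.

```python
import random


def margins(table):
    """Sufficient statistics of the independence model: row and column sums."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return rows, cols


def step(table, action):
    """Apply one basic move and return (next_state, accepted).

    action = (i, j, k, l) adds +1 at cells (i, k) and (j, l) and
    -1 at cells (i, l) and (j, k). Such a move leaves every row and
    column sum unchanged. If it would create a negative cell, it is
    rejected and the state is returned unchanged.
    """
    i, j, k, l = action
    if i == j or k == l:
        return table, False
    if table[i][l] == 0 or table[j][k] == 0:
        return table, False
    new = [row[:] for row in table]
    new[i][k] += 1
    new[j][l] += 1
    new[i][l] -= 1
    new[j][k] -= 1
    return new, True


def random_walk(table, n_steps, seed=0):
    """Walk on the fiber using uniformly random actions (placeholder policy)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    state = [row[:] for row in table]
    for _ in range(n_steps):
        i, j = rng.sample(range(n_rows), 2)  # ordered pair of distinct rows
        k, l = rng.sample(range(n_cols), 2)  # ordered pair of distinct columns
        state, _ = step(state, (i, j, k, l))
    return state


if __name__ == "__main__":
    observed = [[3, 0, 2], [1, 4, 0], [0, 2, 5]]
    sampled = random_walk(observed, n_steps=200, seed=1)
    print(margins(observed) == margins(sampled))
```

In this toy model the set of basic moves is small and known in advance. The difficulty the abstract describes arises for models where a complete set of connecting moves is expensive to compute, which is where a learned policy over moves becomes relevant.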