Spanning tree methods for sampling graph partitions

📅 2022-10-04
🏛️ arXiv.org
📈 Citations: 22
Influential: 7
🤖 AI Summary
This work addresses the lack of neutral, interpretable benchmarks for detecting partisan gerrymandering in political redistricting. It proposes RevReCom, a reversible variant of the ReCom Markov chain whose stationary distribution has a closed form: a plan's stationary probability is proportional to the product of the spanning tree counts of its districts. This spanning tree score favors compact districts and aligns with notions of community structure from network science, while contiguity and population balance are constraints on which partitions count as valid plans. Unlike standard ReCom, whose stationary distribution has no known closed form for three or more districts, RevReCom targets a distribution that can be written down explicitly, and it supports standard convergence diagnostics (e.g., effective sample size, trace plots). The authors present diagnostic evidence that convergence is efficient enough for practical use, with high-quality samples for full-sized redistricting problems within an hour. The chain can serve as a benchmark that describes a normal range for statistics of redistricting plans, and as a reference for validating other samplers that target the spanning tree distribution.
📝 Abstract
In the last decade, computational approaches to graph partitioning have made a major impact in the analysis of political redistricting, including in U.S. courts of law. Mathematically, a districting plan can be viewed as a balanced partition of a graph into connected subsets. Examining a large sample of valid alternative districting plans can help us recognize gerrymandering against an appropriate neutral baseline. One algorithm that is widely used to produce random samples of districting plans is a Markov chain called recombination (or ReCom), which repeatedly fuses adjacent districts, forms a spanning tree of their union, and splits that spanning tree with a balanced cut to form new districts. One drawback is that this chain’s stationary distribution has no known closed form when there are three or more districts. In this paper, we modify ReCom slightly to give it a property called reversibility, resulting in a new Markov chain, RevReCom. This new chain converges to the simple, natural distribution that ReCom was originally designed to approximate: a plan’s stationary probability is proportional to the product of the number of spanning trees of each district. This spanning tree score is a measure of district “compactness” (or shape) that is also aligned with notions of community structure from network science. After deriving the steady state formally, we present diagnostic evidence that the convergence is efficient enough for the method to be practically useful, giving high-quality samples for full-sized problems within an hour. In addition to the primary application of benchmarking of redistricting plans (i.e., describing a normal range for statistics), this chain can also be used to validate other methods that target the spanning tree distribution.
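The spanning tree score described in the abstract can be illustrated with a small sketch. It counts the spanning trees of each district with Kirchhoff's matrix-tree theorem (a standard result, not something taken from the paper's code) and sums the logarithms, so a plan's unnormalized log stationary weight is the returned value. The function names `count_spanning_trees` and `log_spanning_tree_score`, and the 2x4 grid example, are assumptions made for illustration only.

```python
from fractions import Fraction
from math import log

def count_spanning_trees(nodes, edges):
    """Count spanning trees of a simple undirected graph (matrix-tree theorem)."""
    nodes = list(nodes)
    n = len(nodes)
    if n <= 1:
        return 1
    index = {v: i for i, v in enumerate(nodes)}
    # Graph Laplacian L = D - A, kept exact with rational arithmetic.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        i, j = index[u], index[v]
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
    # Any cofactor of L equals the tree count; drop the last row and column.
    m = [row[:-1] for row in lap[:-1]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)

def log_spanning_tree_score(edges, assignment):
    """Sum of log(tree count) over districts; assignment maps node -> district."""
    districts = {}
    for node, label in assignment.items():
        districts.setdefault(label, []).append(node)
    total = 0.0
    for members in districts.values():
        inside = set(members)
        induced = [(u, v) for u, v in edges if u in inside and v in inside]
        trees = count_spanning_trees(members, induced)
        if trees == 0:
            return float("-inf")  # a disconnected district is not a valid plan
        total += log(trees)
    return total

# Hypothetical example: a 2x4 grid graph split into two districts of 4 nodes.
grid_nodes = [(r, c) for r in range(2) for c in range(4)]
grid_edges = [((r, c), (r, c + 1)) for r in range(2) for c in range(3)]
grid_edges += [((0, c), (1, c)) for c in range(4)]
blocks = {v: (0 if v[1] < 2 else 1) for v in grid_nodes}  # two 2x2 squares
rows = {v: v[0] for v in grid_nodes}                      # two 1x4 strips
blocks_score = log_spanning_tree_score(grid_edges, blocks)
rows_score = log_spanning_tree_score(grid_edges, rows)
```

In this toy example each 2x2 square has 4 spanning trees and each 1x4 strip has exactly 1, so the plan with square districts carries 16 times the weight of the plan with strip districts, which is the sense in which the score rewards compact shapes.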
Problem

Research questions and friction points this paper is trying to address.

Develops new sampling methods for political districting validation
Establishes spanning tree distribution as principled comparison baseline
Provides powerful null model to detect gerrymandering scientifically
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reversible recombination algorithm for sampling
Spanning tree distribution for validation
Efficient open-source implementation for datasets
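The recombination move that the abstract describes (fuse two adjacent districts, form a spanning tree of their union, split the tree with a balanced cut) can be sketched as below. This is a simplified ReCom-style proposal only: the modifications that make RevReCom reversible are not reproduced here. Drawing the tree with Wilson's algorithm, the tolerance rule in `balanced_cuts`, and all function names are assumptions of this sketch, not the authors' implementation.

```python
import random

def wilson_spanning_tree(adj, rng):
    """Uniform random spanning tree via loop-erased random walks.
    adj maps node -> list of neighbors; returns a dict child -> parent."""
    nodes = list(adj)
    in_tree = {nodes[0]}  # the first node acts as the root
    parent = {}
    for start in nodes:
        # Walk until the tree is hit; overwriting nxt[u] erases loops.
        nxt, u = {}, start
        while u not in in_tree:
            nxt[u] = rng.choice(adj[u])
            u = nxt[u]
        # Retrace the loop-erased path and attach it to the tree.
        u = start
        while u not in in_tree:
            in_tree.add(u)
            parent[u] = nxt[u]
            u = nxt[u]
    return parent

def children_of(parent):
    """Invert a child -> parent map into parent -> list of children."""
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)
    return children

def balanced_cuts(parent, pop, tol):
    """Tree edges whose removal leaves both sides within tol of half the population."""
    children = children_of(parent)
    root = next(v for v in pop if v not in parent)
    order, stack = [], [root]
    while stack:  # preorder: every parent appears before its children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    below = dict(pop)  # population of the subtree hanging under each node
    for v in reversed(order):
        if v in parent:
            below[parent[v]] += below[v]
    ideal = below[root] / 2
    return [(v, parent[v]) for v in parent if abs(below[v] - ideal) <= tol * ideal]

def recom_style_split(adj, pop, tol, rng):
    """Propose a two-district split of the merged region, or None if no balanced cut."""
    parent = wilson_spanning_tree(adj, rng)
    cuts = balanced_cuts(parent, pop, tol)
    if not cuts:
        return None
    child, _ = rng.choice(cuts)
    children = children_of(parent)
    side, stack = set(), [child]
    while stack:  # collect the subtree under the cut edge
        v = stack.pop()
        side.add(v)
        stack.extend(children.get(v, []))
    return side, set(adj) - side
```

A caller would build `adj` and `pop` for the union of two adjacent districts and retry when the function returns `None`. Both returned sides are connected, because each is a subtree of the sampled spanning tree.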
Sarah Cannon
Georgia Institute of Technology
Markov chains, sampling algorithms, random tilings, programmable matter, computational geometry
M. Duchin
Data Science Institute, University of Chicago, Chicago, IL 60615
Dana Randall
Georgia Institute of Technology
Algorithms, Theoretical Computer Science
Parker Rule
Tisch College of Civic Life, Tufts University, Medford, MA 02155 USA