Spanning tree methods for sampling graph partitions

📅 2022-10-04

🏛️ arXiv.org

📈 Citations: 22

✨ Influential: 7

🤖 AI Summary

This work addresses the need for a neutral, interpretable baseline for detecting partisan gerrymandering in political redistricting. The authors propose RevReCom, a reversible variant of the ReCom Markov chain whose stationary distribution has a closed form: a plan's stationary probability is proportional to the product of spanning tree counts across its districts. This spanning tree score measures district compactness and aligns with notions of community structure from network science. Unlike standard ReCom, whose stationary distribution has no known closed form for three or more districts, RevReCom targets a theoretically tractable distribution, and diagnostic evidence indicates that it converges efficiently enough to be practical. On full-sized redistricting problems, it generates high-quality samples within an hour. The result is a mathematically grounded and computationally feasible benchmark for redistricting plans. RevReCom can also be used to validate other samplers that target the spanning tree distribution.
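To make the spanning tree score concrete, the following is a minimal sketch (not the authors' implementation) that counts spanning trees with Kirchhoff's matrix-tree theorem and multiplies the counts across districts. The function names `count_spanning_trees` and `spanning_tree_score`, and the 2x4 grid example, are assumptions chosen for illustration.

```python
from fractions import Fraction


def count_spanning_trees(nodes, edges):
    """Count spanning trees of an undirected graph (matrix-tree theorem)."""
    nodes = list(nodes)
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Graph Laplacian L = D - A, kept exact with rational arithmetic.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        i, j = index[u], index[v]
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
    # Drop the last row and column; the minor's determinant is the tree count.
    m = [row[:-1] for row in lap[:-1]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)


def spanning_tree_score(edges, assignment):
    """Product over districts of the spanning tree count of each district.

    `assignment` maps node -> district label (hypothetical representation).
    """
    districts = {}
    for node, label in assignment.items():
        districts.setdefault(label, []).append(node)
    score = 1
    for label, members in districts.items():
        inside = [(u, v) for u, v in edges
                  if assignment[u] == label and assignment[v] == label]
        score *= count_spanning_trees(members, inside)
    return score


# Illustrative 2x4 grid graph with two ways to cut it into two districts.
grid_nodes = [(r, c) for r in range(2) for c in range(4)]
grid_edges = [((r, c), (r, c + 1)) for r in range(2) for c in range(3)]
grid_edges += [((0, c), (1, c)) for c in range(4)]
blocks = {v: v[1] // 2 for v in grid_nodes}  # two 2x2 squares
rows = {v: v[0] for v in grid_nodes}         # two 1x4 strips

print(spanning_tree_score(grid_edges, blocks))  # 16
print(spanning_tree_score(grid_edges, rows))    # 1
```

In this toy case the plan made of two square districts scores 16 and the plan made of two strip districts scores 1, so a sampler targeting the spanning tree distribution would weight the more compact plan 16 times as heavily.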

📝 Abstract

In the last decade, computational approaches to graph partitioning have made a major impact in the analysis of political redistricting, including in U.S. courts of law. Mathematically, a districting plan can be viewed as a balanced partition of a graph into connected subsets. Examining a large sample of valid alternative districting plans can help us recognize gerrymandering against an appropriate neutral baseline. One algorithm that is widely used to produce random samples of districting plans is a Markov chain called recombination (or ReCom), which repeatedly fuses adjacent districts, forms a spanning tree of their union, and splits that spanning tree with a balanced cut to form new districts. One drawback is that this chain’s stationary distribution has no known closed form when there are three or more districts. In this paper, we modify ReCom slightly to give it a property called reversibility, resulting in a new Markov chain, RevReCom. This new chain converges to the simple, natural distribution that ReCom was originally designed to approximate: a plan’s stationary probability is proportional to the product of the number of spanning trees of each district. This spanning tree score is a measure of district “compactness” (or shape) that is also aligned with notions of community structure from network science. After deriving the steady state formally, we present diagnostic evidence that the convergence is efficient enough for the method to be practically useful, giving high-quality samples for full-sized problems within an hour. In addition to the primary application of benchmarking of redistricting plans (i.e., describing a normal range for statistics), this chain can also be used to validate other methods that target the spanning tree distribution.
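The ReCom move described in the abstract (fuse two adjacent districts, form a spanning tree of their union, split the tree with a balanced cut) can be sketched roughly as follows. This is an illustrative sketch only: the function names are hypothetical, the spanning tree is drawn with Kruskal's algorithm on shuffled edges (which is not a uniform sample over spanning trees), and the extra step that makes RevReCom reversible is not shown because the abstract does not specify it.

```python
import random


def random_spanning_tree(nodes, edges, rng):
    """Kruskal's algorithm on randomly shuffled edges (assumed tree sampler)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    shuffled = list(edges)
    rng.shuffle(shuffled)
    tree = []
    for u, v in shuffled:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree


def balanced_splits(nodes, tree, pop, lo, hi):
    """Node sets cut off by one tree edge with both sides' population in [lo, hi]."""
    adj = {v: [] for v in nodes}
    for u, v in tree:
        adj[u].append(v)
        adj[v].append(u)
    root = nodes[0]
    parent, order, stack = {root: None}, [root], [root]
    while stack:  # depth-first search; every node is discovered after its parent
        v = stack.pop()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)
                stack.append(w)
    below = {v: {v} for v in order}  # nodes in the subtree hanging below v
    for v in reversed(order):        # children are merged before their parents
        if parent[v] is not None:
            below[parent[v]] |= below[v]
    total = sum(pop[v] for v in nodes)
    splits = []
    for v in order[1:]:              # each non-root node names the edge to its parent
        side = sum(pop[w] for w in below[v])
        if lo <= side <= hi and lo <= total - side <= hi:
            splits.append(below[v])
    return splits


def recom_proposal(assignment, edges, pop, a, b, lo, hi, rng):
    """Propose new districts a and b; return None if the tree has no balanced cut."""
    merged = [v for v in assignment if assignment[v] in (a, b)]
    inner = [(u, v) for u, v in edges
             if assignment[u] in (a, b) and assignment[v] in (a, b)]
    tree = random_spanning_tree(merged, inner, rng)
    splits = balanced_splits(merged, tree, pop, lo, hi)
    if not splits:
        return None
    side = rng.choice(splits)
    proposal = dict(assignment)
    for v in merged:
        proposal[v] = a if v in side else b
    return proposal
```

The sketch assumes districts `a` and `b` are adjacent and each connected, so their union has a spanning tree. Because each side of a tree cut is itself a subtree, both new districts are connected by construction; a proposal can still fail when the sampled tree has no edge whose removal balances the populations.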

Problem

Research questions and friction points this paper is trying to address.

Develops new sampling methods for political districting validation

Establishes spanning tree distribution as principled comparison baseline

Provides powerful null model to detect gerrymandering scientifically

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reversible recombination algorithm for sampling

Spanning tree distribution for validation

Efficient open-source implementation for full-sized redistricting problems

🔎 Similar Papers

Graph sub-sampling for divide-and-conquer algorithms in large networks

2024-09-11 · arXiv.org · Citations: 0
