Conditional Predictive Inference for General Structured Data with Group Symmetries

📅 2026-05-18

🤖 AI Summary
Existing conditional predictive inference methods assume i.i.d. or exchangeable data, so their coverage guarantees do not carry over to structured data with group symmetries, such as networks or hierarchical clusters. This work proposes C-SymmPI, a framework that achieves near-conditional coverage for general structured data that are non-exchangeable yet distributionally invariant under a group of symmetries. The approach, inspired by relaxed multi-accuracy, reformulates conditional coverage as miscoverage error over a user-specified function class. It comes with theoretical guarantees under distributional invariance and distribution shift, and with convergence rates for linear and RKHS function classes that recover state-of-the-art results for the exchangeable setting as special cases. Two variants address computation: a projection-based algorithm for high-dimensional observations and a sampling-based algorithm for large or infinite groups. In experiments on hierarchical and network data, C-SymmPI delivers more informative and stable conditional coverage, with improved accuracy, compared to existing methods.
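The reformulation of conditional coverage over a function class can be written out as follows. This is a sketch in the standard multi-accuracy style, not the paper's exact definition: the symbols $\hat{C}$ (prediction set), $\alpha$ (target miscoverage level), $\mathcal{F}$ (the user-specified function class), and the tolerance $\varepsilon$ are notation assumed here for illustration.

```latex
% Exact conditional coverage at level 1 - \alpha,
\[
  \mathbb{P}\bigl(Y \in \hat{C}(X) \mid X\bigr) = 1 - \alpha
  \quad \text{almost surely},
\]
% is equivalent to a zero weighted miscoverage error for every test function:
\[
  \mathbb{E}\Bigl[\, f(X)\,\bigl(\mathbf{1}\{Y \notin \hat{C}(X)\} - \alpha\bigr) \Bigr] = 0
  \quad \text{for all bounded measurable } f .
\]
% A relaxed version only asks for this over a chosen class \mathcal{F},
% up to an assumed tolerance \varepsilon \ge 0:
\[
  \Bigl|\, \mathbb{E}\Bigl[\, f(X)\,\bigl(\mathbf{1}\{Y \notin \hat{C}(X)\} - \alpha\bigr) \Bigr] \Bigr|
  \le \varepsilon
  \quad \text{for all } f \in \mathcal{F} .
\]
```

Taking $\mathcal{F}$ to be the constant functions gives marginal coverage, while taking $\mathcal{F}$ to be the span of group indicators gives coverage within each group, so richer classes move the guarantee closer to full conditional coverage.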
📝 Abstract
We study distribution-free predictive inference for data with group symmetries, aiming to establish near-conditional coverage guarantees beyond exchangeability for structured data. While many predictive inference methods achieve a target coverage level, most provide marginal coverage. In practice, conditional predictive inference is often preferred, as it quantifies uncertainty for black-box predictions given observed attributes, thereby accommodating heterogeneity. Although many efforts have pursued efficient conditional coverage, existing methods rely on the i.i.d. or exchangeable assumption, often violated in structured settings such as networks, clusters, and imaging data. Recently, SymmPI introduced a unified approach to predictive inference under group symmetries beyond exchangeability; nevertheless, its guarantees remain marginal and do not account for population heterogeneity. To bridge this gap, we introduce C-SymmPI, a framework that achieves near-conditional coverage under general data structures with group symmetries, extending beyond exchangeability to cover networks, cluster-level data, and related structures. Inspired by relaxed multi-accuracy, our approach reformulates conditional coverage as miscoverage error over a user-specified function class. We establish theoretical guarantees under distributional invariance and distribution shift, and derive convergence rates for linear and RKHS function classes, recovering state-of-the-art results in the exchangeable setting as special cases. For computational efficiency, we develop two variants: a projection-based algorithm for high-dimensional observations, and a sampling-based algorithm for large or infinite groups. We demonstrate effectiveness on hierarchical and network data. Empirical results show that C-SymmPI delivers more informative and stable conditional coverage with improved accuracy compared to existing methods.
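The gap between marginal and near-conditional coverage can be made concrete with a small diagnostic. The sketch below is not C-SymmPI: it uses ordinary split conformal prediction on synthetic i.i.d. data and only measures the weighted miscoverage error over a tiny function class (a constant and two group indicators). All names, the data-generating process, and the function class are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
alpha = 0.1  # target miscoverage level

# Synthetic heteroscedastic data: the noise scale depends on a binary group label.
group = rng.integers(0, 2, size=n)
scale = np.where(group == 1, 3.0, 1.0)
y = scale * rng.standard_normal(n)
pred = np.zeros(n)  # stand-in for a black-box point predictor

# Split the sample into calibration and test halves.
cal = np.arange(n // 2)
test = np.arange(n // 2, n)

# Split conformal: one absolute-residual quantile shared by all test points.
scores = np.abs(y - pred)
k = int(np.ceil((len(cal) + 1) * (1 - alpha)))
q = np.sort(scores[cal])[k - 1]

# Miscoverage indicator for the interval [pred - q, pred + q].
miss = (scores[test] > q).astype(float)


def miscoverage_error(f_values, miss, alpha):
    """Empirical mean of f(X) * (1{Y not covered} - alpha)."""
    return float(np.mean(f_values * (miss - alpha)))


# Hypothetical function class: a constant plus one indicator per group.
function_class = {
    "constant": np.ones(len(test)),
    "group0": (group[test] == 0).astype(float),
    "group1": (group[test] == 1).astype(float),
}
errors = {
    name: miscoverage_error(f, miss, alpha)
    for name, f in function_class.items()
}

for name, err in errors.items():
    print(f"{name}: {err:+.4f}")
```

In this setup the error for the constant function should sit near zero (marginal coverage holds), while the low-noise group is over-covered and the high-noise group under-covered, which shows up as errors of opposite sign on the two indicators. A method with guarantees over this function class would aim to drive all three errors toward zero.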
Problem

Research questions and friction points this paper is trying to address.

conditional predictive inference
group symmetries
structured data
coverage guarantees
distribution-free inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

conditional predictive inference
group symmetries
near-conditional coverage
distribution-free uncertainty quantification
structured data