Testing maximum entropy models with e-values

📅 2025-08-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses hypothesis testing between maximum entropy models, specifically comparing microcanonical (hard-constraint) and canonical (soft-constraint) ensembles.

Method: We propose and formalize the first optimal e-value construction criterion. For the microcanonical case, we derive a closed-form analytical solution; for the canonical case, we design a statistically robust approximation method that retains asymptotic validity under optional continuation. Our approach integrates e-value theory, maximum entropy modeling, and ensemble analysis within a 2×k contingency table framework, validated via analytical derivation and numerical simulation.

Contribution/Results: The method maintains high statistical power even in high-dimensional sparse settings where the number of groups grows with sample size, outperforming conventional p-value-based tests. This work establishes the first e-value testing paradigm for ensemble comparison that simultaneously achieves theoretical optimality and computational feasibility, with direct applications to complex network structure inference and nonstationary time series modeling.

📝 Abstract

E-values have recently emerged as a robust and flexible alternative to p-values for hypothesis testing, especially under optional continuation, i.e., when additional data from further experiments are collected. In this work, we define optimal e-values for testing between maximum entropy models, both in the microcanonical (hard constraints) and canonical (soft constraints) settings. We show that, when testing between two hypotheses that are both microcanonical, the so-called growth-rate optimal e-variable admits an exact analytical expression, which also serves as a valid e-variable in the canonical case. For canonical tests, where exact solutions are typically unavailable, we introduce a microcanonical approximation and verify its excellent performance via both theoretical arguments and numerical simulations. We then consider constrained binary models, focusing on $2 \times k$ contingency tables, an essential framework in statistics and a natural representation for various models of complex systems. Our microcanonical optimal e-variable performs well in both settings, constituting a new tool that remains effective even in the challenging case when the number $k$ of groups grows with the sample size, as in models with growing features used for the analysis of real-world heterogeneous networks and time series.
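To make the notion of an e-value and of optional continuation concrete, here is a minimal Python sketch. It is not the paper's microcanonical construction: it assumes the simplest textbook case of a simple Bernoulli null against a simple Bernoulli alternative, where the likelihood ratio is an e-variable. The function names, the parameters `p0` and `p1`, and the example batches are all hypothetical, chosen only for illustration.

```python
from itertools import product
from math import prod


def likelihood_ratio_e_value(xs, p0, p1):
    """E-value for binary outcomes xs: Bernoulli(p1) likelihood over Bernoulli(p0) likelihood."""
    ones = sum(xs)
    zeros = len(xs) - ones
    return (p1 / p0) ** ones * ((1 - p1) / (1 - p0)) ** zeros


def null_expectation(n, p0, p1):
    """Exact expectation of the e-value under the null, by enumerating all 2**n sequences."""
    total = 0.0
    for xs in product((0, 1), repeat=n):
        ones = sum(xs)
        prob_under_null = p0 ** ones * (1 - p0) ** (n - ones)
        total += prob_under_null * likelihood_ratio_e_value(xs, p0, p1)
    return total


def combine(e_values):
    """Optional continuation: multiply e-values from successive independent batches."""
    return prod(e_values)


# Hypothetical data: two batches collected one after the other.
batches = [(1, 1, 0, 1), (1, 0, 1, 1, 1)]
e_vals = [likelihood_ratio_e_value(b, p0=0.5, p1=0.75) for b in batches]
e_total = combine(e_vals)

# Reject the null at level alpha when the running product reaches 1 / alpha.
alpha = 0.05
reject = e_total >= 1 / alpha
```

The defining property is that the expectation under the null is at most 1, which `null_expectation` checks by exact enumeration for small `n`. Because a product of e-values from independent batches is again an e-value, the evidence can be updated as further experiments arrive, and comparing the running product with `1 / alpha` keeps the type-I error at most `alpha` by Markov's inequality.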

Problem

Research questions and friction points this paper is trying to address.

Testing maximum entropy models using e-values

Defining optimal e-values for microcanonical and canonical settings

Evaluating performance on constrained binary models and contingency tables

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal e-values for maximum entropy models

Microcanonical approximation for canonical tests

Constrained binary models with contingency tables

🔎 Similar Papers

Conformal e-prediction