🤖 AI Summary
To address inaccurate mutual information and entropy estimation for high-dimensional discrete distributions, this paper proposes the first differentiable, unified estimation framework based on continuous-time Markov chains (CTMCs). The method models discrete state spaces directly by parameterizing transition rates, enabling end-to-end optimization without the distortion introduced by continuous embeddings. It supports both training a single model and plug-and-play integration with pretrained generative models. Theoretical analysis guarantees consistency of the information-theoretic estimates, and gradient computation is fully compatible with backpropagation. Experiments on synthetic benchmarks and on Ising-model entropy estimation show that the approach significantly outperforms existing embedding-based neural estimators, while scaling better to high dimensions and reducing both memory footprint and computational overhead.
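To make the rate-parameterization idea concrete, here is a minimal, hypothetical sketch of how a neural network could output CTMC transition rates over a discrete state space in PyTorch. All names and architectural choices (`RateNet`, the embedding-plus-MLP design, the softplus nonlinearity) are illustrative assumptions, not code from the paper; the point is only that the rates come from an ordinary differentiable network, so any loss built on them trains with standard backpropagation.

```python
# Hypothetical sketch, not the paper's implementation: a network that
# parameterizes CTMC transition rates q_t(x -> x') for each sequence position.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RateNet(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)  # discrete states, no continuous relaxation
        self.time_mlp = nn.Sequential(nn.Linear(1, hidden), nn.SiLU())
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, vocab_size),             # one rate per candidate target state
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len) integer states; t: (batch,) times in (0, 1)
        h = self.embed(x) + self.time_mlp(t[:, None])[:, None, :]
        # softplus keeps off-diagonal rates non-negative, as a CTMC requires
        return F.softplus(self.head(h))                # (batch, seq_len, vocab_size)

# Illustrative usage on binary data (e.g. Ising spins):
net = RateNet(vocab_size=2)
x = torch.randint(0, 2, (8, 16))   # batch of 8 length-16 spin configurations
t = torch.rand(8)
rates = net(x, t)                  # differentiable in net's parameters
```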
📝 Abstract
Information-theoretic quantities play a crucial role in understanding non-linear relationships between random variables and are widely used across scientific disciplines. However, estimating these quantities remains an open problem, particularly for high-dimensional discrete distributions. Current approaches typically embed discrete data into a continuous space and apply neural estimators originally designed for continuous distributions, a process that may not fully capture the discrete nature of the underlying data. We consider Continuous-Time Markov Chains (CTMCs), stochastic processes on discrete state spaces that have gained popularity through their generative-modeling applications. In this work, we introduce INFO-SEDD, a novel method for estimating information-theoretic quantities of discrete data, including mutual information (MI) and entropy. Our approach requires training only a single parametric model, offering significant computational and memory advantages. Additionally, it integrates seamlessly with pretrained networks, allowing efficient reuse of pretrained generative models. To evaluate our approach, we construct a challenging synthetic benchmark. Our experiments demonstrate that INFO-SEDD is robust and outperforms neural competitors that rely on embedding techniques. We further validate the method on a real-world task: estimating the entropy of an Ising model. Overall, INFO-SEDD outperforms competing methods and scales to high-dimensional scenarios, paving the way for new applications where estimating MI between discrete distributions is the focus. The promising results in this complex, high-dimensional setting highlight INFO-SEDD as a powerful new estimator in the toolkit for information-theoretic analysis.
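For reference, the abstract's two target quantities are tied together by a textbook identity (standard information theory, not specific to INFO-SEDD): any consistent entropy estimator immediately yields an MI estimator, since

$$
I(X;Y) \;=\; H(X) + H(Y) - H(X,Y) \;=\; H(X) - H(X \mid Y),
\qquad
H(X) \;=\; -\sum_{x} p(x)\log p(x).
$$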