🤖 AI Summary
In high-dimensional settings (hundreds of dimensions), explicit computation of the k-nearest-neighbor (k-NN) radius in the joint space, which is required for k-NN–based estimation of normalized mutual information, suffers from numerical overflow because Euclidean distances are amplified exponentially with dimension. This work proposes the first logarithmic-space transformation designed specifically for k-NN radius computation: it relocates the distance exponentiation into the log domain, eliminating overflow at the level of the numerical analysis. The method introduces no approximation or dimensionality reduction; we prove theoretically that its estimation bias matches, and remains as controllable as, that of standard k-NN estimators. Empirically, it avoids overflow entirely on real and synthetic datasets with hundreds of dimensions while preserving mutual-information estimation accuracy. This contribution establishes a stable, accurate, and plug-and-play numerical foundation for quantifying statistical dependence in high-dimensional spaces.
📝 Abstract
Mutual information provides a powerful, general-purpose measure of the amount of information shared between variables. Estimating normalized mutual information with a k-nearest-neighbor (k-NN) approach requires computing the scale-invariant k-NN radius. This radius calculation suffers from numerical overflow when the joint dimensionality of the data becomes high, typically in the range of several hundred dimensions. To address this issue, we propose a logarithmic transformation technique that improves the numerical stability of the radius calculation in high-dimensional spaces. Applying the proposed transformation during the radius calculation avoids numerical overflow while maintaining precision. The proposed transformation is validated through both theoretical analysis and empirical evaluation, demonstrating that it stabilizes the calculation without compromising the precision of the results.
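The overflow mechanism and the log-space remedy can be illustrated with a minimal sketch. This is not the paper's implementation: the function name and the use of the d-ball volume term (which appears in KSG-style k-NN entropy estimators and involves raising the radius to the power of the dimension) are illustrative assumptions.

```python
import math

def log_ball_volume(d: int, log_r: float) -> float:
    """Log-volume of a d-dimensional Euclidean ball, computed entirely in log space.

    Direct evaluation of V = pi^(d/2) / Gamma(d/2 + 1) * r^d overflows for large d,
    because r**d grows (or shrinks) exponentially with the dimension.  Working with
    log V = (d/2)*log(pi) - lgamma(d/2 + 1) + d*log(r) keeps every term moderate.
    """
    return 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1) + d * log_r

# Direct exponentiation of a k-NN radius overflows in high dimensions:
d, r = 600, 5.0
try:
    _ = r ** d                      # 5.0 ** 600 exceeds the float range
except OverflowError:
    pass                            # overflow in the direct computation

# The same quantity is perfectly representable in the log domain:
log_v = log_ball_volume(d, math.log(r))
assert math.isfinite(log_v)
```

For small dimensions the log-space result agrees with the direct formula, e.g. `math.exp(log_ball_volume(3, math.log(2.0)))` recovers the volume of a 3-ball of radius 2, `(4/3)*pi*8`.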