Vendi Information Gain: An Alternative To Mutual Information For Science And Machine Learning

📅 2025-05-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Mutual information (MI) suffers from inherent limitations in high-dimensional settings—computational intractability, neglect of sample similarity, and intrinsic symmetry—hindering its applicability in complex, distribution-agnostic scenarios. Method: We propose Vendi Information Gain (VIG), a novel information-theoretic measure that introduces the Vendi score into information theory. VIG defines Vendi entropy and conditional Vendi entropy, operating directly on kernel-based similarity matrices without explicit probabilistic modeling, inherently asymmetry, and distribution-free estimation. Theoretically, VIG strictly generalizes MI; algorithmically, it integrates level-set density estimation with an active acquisition framework. Contribution/Results: Empirically, VIG achieves superior performance across diverse domains: accurately modeling human reaction times in cognitive science, identifying transnational disease hotspots in epidemiology, and consistently outperforming MI in high-dimensional tasks. It underpins the first unified theoretical framework for active data acquisition, enabling principled, structure-aware information maximization.

Technology Category

Application Category

📝 Abstract
In his 1948 seminal paper A Mathematical Theory of Communication that birthed information theory, Claude Shannon introduced mutual information (MI), which he called"rate of transmission", as a way to quantify information gain (IG) and defined it as the difference between the marginal and conditional entropy of a random variable. While MI has become a standard tool in science and engineering, it has several shortcomings. First, MI is often intractable - it requires a density over samples with tractable Shannon entropy - and existing techniques for approximating it often fail, especially in high dimensions. Moreover, in settings where MI is tractable, its symmetry and insensitivity to sample similarity are undesirable. In this paper, we propose the Vendi Information Gain (VIG), a novel alternative to MI that leverages the Vendi scores, a flexible family of similarity-based diversity metrics. We call the logarithm of the VS the Vendi entropy and define VIG as the difference between the marginal and conditional Vendi entropy of a variable. Being based on the VS, VIG accounts for similarity. Furthermore, VIG generalizes MI and recovers it under the assumption that the samples are completely dissimilar. Importantly, VIG only requires samples and not a probability distribution over them. Finally, it is asymmetric, a desideratum for a good measure of IG that MI fails to meet. VIG extends information theory to settings where MI completely fails. For example, we use VIG to describe a novel, unified framework for active data acquisition, a popular paradigm of modern data-driven science. We demonstrate the advantages of VIG over MI in diverse applications, including in cognitive science to model human response times to external stimuli and in epidemiology to learn epidemic processes and identify disease hotspots in different countries via level-set estimation.
Problem

Research questions and friction points this paper is trying to address.

Mutual Information (MI) is intractable and hard to approximate in high dimensions.
MI lacks sensitivity to sample similarity and has undesirable symmetry.
Existing IG measures fail to meet asymmetry and sample-based requirements.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Vendi Information Gain (VIG) alternative
Leverages similarity-based Vendi scores metrics
Requires samples not probability distributions