🤖 AI Summary
Problem: Normalized Mutual Information (NMI) suffers from two fundamental biases when used to evaluate clustering and community detection: (i) it ignores the information content of the contingency table itself, and (ii) its symmetric normalization introduces a spurious dependence on the algorithm's output labeling. Method: We trace both biases to their information-theoretic origins and propose Unbiased Mutual Information (UB-MI), the first mutual information measure to eliminate both biases simultaneously. UB-MI rederives the normalization from first principles of information theory, preserving the full statistical content of the contingency table. Contribution/Results: On multi-algorithm community detection benchmarks, NMI substantially distorts performance rankings, reducing average rank correlation by up to 0.32, whereas UB-MI improves evaluation robustness, interpretability, and agreement with ground-truth structure, increasing ranking consistency by 17.6%. UB-MI thus provides a theoretically sounder foundation for unsupervised evaluation.
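For concreteness, one common symmetric normalization of the mutual information is the arithmetic-mean form shown below (the summary does not specify which variant is analyzed, so take this as a representative example rather than the paper's exact definition):

```latex
\mathrm{NMI}(c, g) \;=\; \frac{2\, I(c; g)}{H(c) + H(g)},
\qquad
I(c; g) \;=\; \sum_{r, s} p(r, s) \, \log \frac{p(r, s)}{p(r)\, p(s)},
```

where g is the ground-truth labeling, c is the candidate labeling produced by the algorithm, and p(r, s) is the joint distribution read off the contingency table. Because the candidate entropy H(c) appears in the denominator, the score depends on the distribution of the algorithm's own output, which is exactly the dependence that bias (ii) refers to.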
📝 Abstract
Normalized mutual information is widely used as a similarity measure for evaluating the performance of clustering and classification algorithms. In this paper, we argue that results returned by the normalized mutual information are biased for two reasons: first, because the measure ignores the information content of the contingency table and, second, because its symmetric normalization introduces a spurious dependence on the algorithm's output. We introduce a modified version of the mutual information that remedies both of these shortcomings. As a practical demonstration of the importance of using an unbiased measure, we perform extensive numerical tests on a basket of popular algorithms for network community detection and show that one's conclusions about which algorithm is best are significantly affected by the biases in the traditional mutual information.
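As a minimal illustration of the first bias, the sketch below (our own, not code from the paper; the function names are hypothetical) computes the plain mutual information and the arithmetic-mean NMI directly from a contingency table. It shows that the trivial labeling placing every object in its own cluster attains the same maximal mutual information as a perfect labeling:

```python
# Illustrative sketch (not from the paper): plain MI and arithmetic-mean NMI
# computed from a contingency table.
import numpy as np

def entropy(p):
    """Shannon entropy in nats; zero-probability cells contribute 0."""
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def mi_and_nmi(table):
    """Mutual information I(c; g) and arithmetic-mean NMI, 2*I / (H(c) + H(g)),
    from a contingency table whose rows index ground-truth labels g and whose
    columns index candidate labels c."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()                     # joint distribution p(r, s)
    h_g = entropy(p.sum(axis=1))        # entropy of the ground-truth labeling
    h_c = entropy(p.sum(axis=0))        # entropy of the candidate labeling
    mi = h_g + h_c - entropy(p.ravel()) # I = H(g) + H(c) - H(g, c)
    return mi, 2 * mi / (h_g + h_c)

# Ground truth: two groups of 10 objects each.
perfect = np.diag([10, 10])             # candidate recovers both groups exactly
singletons = np.zeros((2, 20))          # candidate puts each object in its own cluster
singletons[0, :10] = 1
singletons[1, 10:] = 1

for name, table in [("perfect", perfect), ("singletons", singletons)]:
    mi, nmi = mi_and_nmi(table)
    print(f"{name:10s}  I = {mi:.3f} nats   NMI = {nmi:.3f}")
```

Both labelings attain the maximal I(c; g) = H(g) ≈ 0.693 nats, so the raw mutual information gives a perfect score to the uninformative singleton partition, which is the first bias. The symmetric NMI separates the two cases only by dividing through by the candidate entropy H(c), and it is that dependence on the output distribution that the abstract identifies as the second, spurious bias.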