🤖 AI Summary
Existing backbone extraction methods for weighted networks rely on manual parameter tuning or strong topological constraints (e.g., spanning trees), limiting their general applicability to network sparsification. This paper proposes the first fully nonparametric backbone inference framework, grounded in the Minimum Description Length (MDL) principle and combining microcanonical/canonical Bayesian modeling with greedy optimization to automatically identify the most informative subset of edges. Key contributions include: (1) a dual-scale (global/local) objective for edge evaluation; (2) elimination of dependence on significance thresholds and topological priors; and (3) compatibility with arbitrary edge-weight distributions. Extensive experiments on real-world and synthetic networks demonstrate high edge compression ratios while preserving connectivity, weight heterogeneity, and dynamical spreading behavior. The algorithm achieves a time complexity of O(|E| log |E|).
📄 Abstract
Network backbones provide useful sparse representations of weighted networks by keeping only their most important links, permitting a range of computational speedups and simplifying network visualizations. A key limitation of existing network backboning methods is that they either require the specification of a free parameter (e.g., a significance level) that determines the number of edges to keep in the backbone, or impose specific restrictions on the topology of the backbone (e.g., that it is a spanning tree). Here we develop a completely nonparametric framework for inferring the backbone of a weighted network that overcomes these limitations and automatically selects the optimal set of edges to retain using the Minimum Description Length (MDL) principle. We develop objective functions for global and local network backboning, which evaluate the importance of an edge in the context of the whole network and of individual node neighborhoods respectively, and which are generalizable to any weight distribution under canonical and microcanonical Bayesian model specifications. We then construct an efficient and provably optimal greedy algorithm to identify the backbone minimizing our objectives for a large class of weight distributions, whose runtime complexity is log-linear in the number of edges. We demonstrate our methods by comparing them with existing methods in a range of tasks on real and synthetic networks, finding that both the global and local backboning methods can preserve network connectivity, weight heterogeneity, and spreading dynamics while removing a substantial fraction of edges.
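To make the shape of such a greedy MDL procedure concrete, here is a minimal, hypothetical sketch: edges are sorted once by weight (the O(|E| log |E|) step), and a single scan then finds the prefix minimizing a two-part description length. The objective below is a toy proxy invented for illustration (a binomial model cost for which edges are kept, plus constants `per_edge_bits` and `residual_bits` for encoding costs), not the paper's actual global or local objective:

```python
import math

def greedy_backbone(edges, per_edge_bits=8.0, residual_bits=2.0):
    """Toy greedy MDL-style backbone selection (illustrative only).

    edges: list of (u, v, w) tuples with positive weights w.
    Sorts edges by descending weight, then keeps the prefix whose
    two-part description length is smallest:
      model cost = bits to say which k of |E| edges are kept,
                   plus per_edge_bits per retained edge weight;
      data cost  = residual_bits per unit of unexplained weight.
    Both cost constants are invented for this sketch.
    """
    E = len(edges)
    total_w = sum(w for _, _, w in edges)
    ranked = sorted(edges, key=lambda e: -e[2])  # O(|E| log |E|)

    best_len, best_k = float("inf"), 0
    kept_w = 0.0
    for k in range(E + 1):  # single linear scan over prefixes
        if k > 0:
            kept_w += ranked[k - 1][2]
        model = math.log2(math.comb(E, k)) + per_edge_bits * k
        data = residual_bits * (total_w - kept_w)
        if model + data < best_len:
            best_len, best_k = model + data, k
    return ranked[:best_k]
```

On a small example with two heavy and two light edges, the minimum falls at a prefix retaining only the heavy edges, which is the qualitative behavior a backbone method should show:

```python
kept = greedy_backbone([(0, 1, 10.0), (1, 2, 8.0), (2, 3, 1.0), (0, 3, 0.5)])
# kept contains the two heaviest edges
```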