🤖 AI Summary
To address the information access inequity that minority groups face in online social networks because of their disadvantaged structural position, this paper proposes an edge-augmentation-based fairness optimization method. It introduces resistance distance as the first metric to quantify cross-group information accessibility, jointly capturing global topology and multi-path connectivity. It also designs the first linear-time approximation algorithm for the NP-hard edge-augmentation problem, overcoming its computational intractability. The approach integrates spectral graph theory, random-walk-based approximation, and edge-importance sampling. Extensive experiments on real-world and synthetic networks with up to millions of nodes show that the method achieves accuracy comparable to the cubic-time greedy algorithm while reducing runtime by over three orders of magnitude. Crucially, it significantly improves inter-group fairness in information diffusion, measured via equitable reachability across demographic groups, without compromising overall network efficiency.
📝 Abstract
The advent of online social networks has facilitated the fast and wide spread of information. However, some users, especially members of minority groups, may be less likely to receive information spreading on the network because of their disadvantaged network position. We study the optimization problem of adding new connections to a network to enhance fairness in information access among different demographic groups.
We provide a concrete formulation of this problem in which information access is measured in terms of resistance distance, offering a new perspective that emphasizes global network structure and multi-path connectivity. The problem is shown to be NP-hard. We propose a simple greedy algorithm that outputs accurate solutions, but its running time is cubic, which makes it impractical for large networks. As our main technical contribution, we reduce its time complexity to linear by leveraging several novel approximation techniques. In addition to our theoretical findings, we conduct an extensive set of experiments on both real-world and synthetic datasets. We demonstrate that our linear-time algorithm can produce accurate solutions for networks with millions of nodes.
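To make the resistance-distance measure concrete, the following is a minimal sketch, not the paper's algorithm. It computes the resistance distance between two nodes of a small, unweighted, connected graph by grounding one node and solving the reduced Laplacian system exactly. The helper `greedy_edge` and its objective (the sum of resistance distances from one source node to a target group) are hypothetical stand-ins chosen for illustration; the abstract does not state the fairness objective. This brute-force version solves a dense linear system for every candidate edge, which is the kind of cost that the paper's linear-time approximation is meant to avoid.

```python
from itertools import combinations


def resistance_distance(n, edges, u, v):
    """Resistance distance between u and v in an unweighted, connected graph.

    Nodes are 0..n-1 and edges is a list of (a, b) pairs.
    """
    if u == v:
        return 0.0
    # Build the graph Laplacian L = D - A.
    lap = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        lap[a][a] += 1.0
        lap[b][b] += 1.0
        lap[a][b] -= 1.0
        lap[b][a] -= 1.0
    # Ground node v: drop its row and column, then solve L_v x = e_u.
    # The potential x_u at node u equals the resistance distance R(u, v).
    keep = [i for i in range(n) if i != v]
    m = len(keep)
    aug = [[lap[i][j] for j in keep] + [1.0 if i == u else 0.0] for i in keep]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                for c in range(col, m + 1):
                    aug[r][c] -= f * aug[col][c]
    k = keep.index(u)
    return aug[k][m] / aug[k][k]


def greedy_edge(n, edges, source, group):
    """Hypothetical greedy step: choose the missing edge that minimizes the
    total resistance distance from `source` to the nodes in `group`.
    This objective is an illustrative assumption, not the paper's."""
    existing = {frozenset(e) for e in edges}

    def cost(es):
        return sum(resistance_distance(n, es, source, t) for t in group)

    candidates = [e for e in combinations(range(n), 2)
                  if frozenset(e) not in existing]
    return min(candidates, key=lambda e: cost(edges + [e]))


# A path 0-1-2 behaves like two unit resistors in series.
path = [(0, 1), (1, 2)]
r_before = resistance_distance(3, path, 0, 2)
# Adding the edge (0, 2) creates a second, parallel route between the endpoints.
r_after = resistance_distance(3, path + [(0, 2)], 0, 2)
```

On the path graph the distance between the endpoints is 2, and adding the edge (0, 2) lowers it to 2/3 even though the original two-hop route is unchanged. This reflects the multi-path view of connectivity that distinguishes resistance distance from shortest-path distance.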