Mini-Batch Class Composition Bias in Link Prediction

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work examines link prediction training in graph neural networks (GNNs), where mini-batch sampling can introduce a class composition bias. Combined with batch normalization, this bias lets popular link prediction models solve the edge classification task with a trivial, mini-batch-dependent heuristic. The resulting representations are not consistent with those learnt for node classification on the same graph, which challenges the expectation that a GNN learns a representation that is consistent across tasks. When the bias is corrected for, the network representation aligns more closely with node-class-relevant features, suggesting it better reflects the underlying graph's properties. The findings suggest that standard link prediction training may lead to overestimating link predictors' ability to learn a generalised representation of a graph.

📝 Abstract

Prior work on node classification has shown that Graph Neural Networks (GNNs) can learn representations that transfer across graphs, when underlying graph properties are shared. For a fixed graph, one would then expect GNNs trained for link prediction to learn a representation consistent with that learnt for node classification. We show this intuition does not hold in the general case. Instead, we find popular link prediction models can learn a trivial mini-batch dependent heuristic, enabled by batch-normalisation layers, to solve the edge classification task. When correcting for this, we observe increased alignment of the network representation with node-class relevant features, suggesting the network has learnt a graph representation that better aligns with the underlying graph's properties. Our findings suggest that standard link prediction training may be leading us to overestimate link predictors' ability to learn a generalised representation of a graph that is consistent across tasks.
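
To make the role of batch normalisation concrete, the following is a minimal sketch, not the authors' experiment, of how train-mode batch normalisation can make the output for one candidate edge depend on the other edges in its mini-batch. The scores, the `probe` edge, and the helper names `batch_norm_train` and `batch_norm_fixed` are hypothetical and chosen only for illustration; the actual heuristic described in the paper may differ.

```python
import statistics


def batch_norm_train(values, eps=1e-5):
    """Train-mode batch norm over scalars: uses the statistics of this batch only."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)  # biased (population) variance
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def batch_norm_fixed(values, mean, var, eps=1e-5):
    """Normalisation with fixed statistics: independent of batch composition."""
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


# Hypothetical pre-normalisation scores for candidate edges (assumed values).
probe = 0.5                          # the edge whose output we track
positives = [1.0, 1.1, 0.9, 1.2]     # scores of true edges
negatives = [0.0, 0.1, -0.1, -0.2]   # scores of sampled non-edges

# The same probe edge placed in two mini-batches with different compositions.
batch_mostly_pos = [probe] + positives + negatives[:1]
batch_mostly_neg = [probe] + positives[:1] + negatives

# With batch statistics, the probe's normalised value changes sign.
out_in_pos_batch = batch_norm_train(batch_mostly_pos)[0]
out_in_neg_batch = batch_norm_train(batch_mostly_neg)[0]

# With fixed statistics (assumed mean 0.5, variance 0.25), it does not change.
fixed_in_pos_batch = batch_norm_fixed(batch_mostly_pos, mean=0.5, var=0.25)[0]
fixed_in_neg_batch = batch_norm_fixed(batch_mostly_neg, mean=0.5, var=0.25)[0]

print(out_in_pos_batch, out_in_neg_batch)
print(fixed_in_pos_batch, fixed_in_neg_batch)
```

In this toy setting the probe edge lands below the batch mean when most of its batch is positive and above it when most of its batch is negative, so a downstream classifier could in principle read batch composition off the normalised value instead of graph structure. Normalising with fixed statistics removes that dependence here; whether this matches the correction used in the paper is not stated in the abstract.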

Problem

Research questions and friction points this paper is trying to address.

link prediction

mini-batch bias

graph representation

graph neural networks

batch normalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

mini-batch bias

link prediction

graph neural networks