Hard negative sampling in hyperedge prediction

📅 2025-03-11
📈 Citations: 0
Influential: 0
📄 PDF

career value

208K/year
🤖 AI Summary
In hyperedge prediction, random negative sampling often yields overly discriminative negatives, causing premature convergence and degrading generalization. To address this, we propose the first method that synthesizes hard negative hyperedges directly in the hyperedge embedding space—departing from conventional node-combination-based negative construction paradigms. Our approach jointly optimizes embedding-space perturbations and geometric constraints to generate high-informativeness, high-challenge hard negatives. We further design a plug-and-play negative sampling module compatible with diverse embedding-based hyperedge prediction models. Extensive experiments on multiple real-world hypergraph datasets demonstrate an average AUC improvement of 4.2%, significantly enhancing model discriminability and robustness. This work establishes a new direction for learning-informed negative sampling in hypergraph representation learning.

Technology Category

Application Category

📝 Abstract
Hypergraph, which allows each hyperedge to encompass an arbitrary number of nodes, is a powerful tool for modeling multi-entity interactions. Hyperedge prediction is a fundamental task that aims to predict future hyperedges or identify existent but unobserved hyperedges based on those observed. In link prediction for simple graphs, most observed links are treated as positive samples, while all unobserved links are considered as negative samples. However, this full-sampling strategy is impractical for hyperedge prediction, due to the number of unobserved hyperedges in a hypergraph significantly exceeds the number of observed ones. Therefore, one has to utilize some negative sampling methods to generate negative samples, ensuring their quantity is comparable to that of positive samples. In current hyperedge prediction, randomly selecting negative samples is a routine practice. But through experimental analysis, we discover a critical limitation of random selecting that the generated negative samples are too easily distinguishable from positive samples. This leads to premature convergence of the model and reduces the accuracy of prediction. To overcome this issue, we propose a novel method to generate negative samples, named as hard negative sampling (HNS). Unlike traditional methods that construct negative hyperedges by selecting node sets from the original hypergraph, HNS directly synthesizes negative samples in the hyperedge embedding space, thereby generating more challenging and informative negative samples. Our results demonstrate that HNS significantly enhances both accuracy and robustness of the prediction. Moreover, as a plug-and-play technique, HNS can be easily applied in the training of various hyperedge prediction models based on representation learning.
Problem

Research questions and friction points this paper is trying to address.

Hyperedge prediction faces impractical full-sampling due to vast unobserved hyperedges.
Random negative sampling in hyperedge prediction leads to easily distinguishable samples.
Proposed hard negative sampling improves prediction accuracy and robustness significantly.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hard negative sampling synthesizes challenging hyperedge samples.
Generates negative samples in hyperedge embedding space directly.
Improves prediction accuracy and robustness significantly.
🔎 Similar Papers
No similar papers found.