🤖 AI Summary
To address the severe class imbalance in triple confidence distributions within Uncertain Knowledge Graphs (UKGs)—which degrades embedding quality and limits completion performance—this paper proposes a semi-supervised confidence distribution learning framework. Methodologically, it models scalar confidences as distributional representations, leverages meta-learning to generate high-quality pseudo-labels, and jointly optimizes embeddings for both labeled and unlabeled data via iterative semi-supervised training. This approach effectively mitigates confidence distribution skew and substantially enhances model generalization to unseen triples. Experiments on two real-world UKG benchmarks demonstrate that the method significantly outperforms existing state-of-the-art baselines on both triple completion and confidence prediction, achieving more robust and accurate joint completion.
📝 Abstract
Uncertain knowledge graphs (UKGs) associate each triple with a confidence score to provide more precise knowledge representations. Since real-world UKGs suffer from incompleteness, uncertain knowledge graph (UKG) completion has recently attracted increasing attention, aiming to complete missing triples and their confidences. Current studies attempt to learn UKG embeddings to solve this problem, but they neglect the extremely imbalanced distributions of triple confidences. As a result, the learnt embeddings are insufficient for high-quality UKG completion. In this paper, to address this issue, we propose a new semi-supervised Confidence Distribution Learning (ssCDL) method for UKG completion, where each triple confidence is transformed into a confidence distribution, introducing richer supervision across different confidence levels to reinforce the embedding learning process. ssCDL iteratively learns UKG embeddings by relational learning on labeled data (i.e., existing triples with confidences) and on unlabeled data with pseudo labels (i.e., unseen triples with generated confidences), which are predicted by meta-learning to augment the training data and rebalance the distribution of triple confidences. Experiments on two UKG datasets demonstrate that ssCDL consistently outperforms state-of-the-art baselines across different evaluation metrics.
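The key transformation in the abstract is turning a scalar triple confidence into a confidence distribution. The paper does not specify the exact construction here, but a common choice in label distribution learning is a discretized Gaussian centered at the scalar value; the sketch below illustrates that assumption (function name, bin count, and bandwidth `sigma` are all hypothetical, not from the paper):

```python
import numpy as np

def confidence_to_distribution(c, n_bins=10, sigma=0.1):
    """Hypothetical sketch: map a scalar confidence c in [0, 1] to a
    discrete distribution over n_bins confidence levels, using a
    Gaussian kernel centered at c and normalizing over bin midpoints."""
    centers = (np.arange(n_bins) + 0.5) / n_bins          # bin midpoints in (0, 1)
    weights = np.exp(-0.5 * ((centers - c) / sigma) ** 2)  # Gaussian weight per bin
    return weights / weights.sum()                         # normalize to sum to 1

# A triple with confidence 0.85 now supervises several nearby
# confidence levels instead of a single scalar target.
dist = confidence_to_distribution(0.85)
print(dist.round(3))
```

Spreading each scalar over neighboring confidence levels is what lets rare confidence values contribute gradient signal to adjacent bins, which is one way the imbalance described above can be softened.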