🤖 AI Summary
In knowledge graph completion (KGC), linear output layers suffer from a rank bottleneck when the number of entities vastly exceeds the embedding dimension, limiting expressiveness, degrading ranking performance, and distorting score distributions. To address this, the authors propose KGE-MoS, a Mixture-of-Softmaxes output architecture for KGC. It uses a gating mechanism to dynamically combine multiple low-rank mapping branches, alleviating the rank constraint with little parameter overhead and balancing added expressivity against computational efficiency. Evaluated on four standard benchmarks (FB15k-237, WN18RR, YAGO3-10, and FB15k), KGE-MoS consistently improves MRR and Hits@k across all datasets while substantially improving the probabilistic calibration of predicted scores, demonstrating gains in both ranking accuracy and faithful score-distribution modeling.
📝 Abstract
Many Knowledge Graph Completion (KGC) models, despite using powerful encoders, rely on a simple vector-matrix multiplication to score queries against candidate object entities. When the number of entities exceeds the model's embedding dimension (in practice, often by several orders of magnitude), this linear output layer has a rank bottleneck. Such bottlenecked layers limit model expressivity. We investigate both theoretically and empirically how rank bottlenecks affect KGC models. We find that, by limiting the set of feasible predictions, rank bottlenecks hurt ranking accuracy and the distribution fidelity of scores. Inspired by the language modelling literature, we propose KGE-MoS, a mixture-based output layer that breaks rank bottlenecks in many KGC models. Our experiments on four datasets show that KGE-MoS improves the performance and probabilistic fit of KGC models at a low parameter cost.
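The mixture-based output layer the abstract describes can be sketched as a mixture of softmaxes: each component applies its own low-rank map to the query encoding, scores all entities, and a gating network mixes the resulting probability distributions. Mixing *after* the softmax is what lets the log-score matrix exceed the rank of the embedding dimension. The function and parameter names below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mos_scores(h, W_proj, W_gate, E):
    """Mixture-of-softmaxes output layer (illustrative sketch).

    h:      (b, d)    query encodings from the KGC encoder
    W_proj: (k, d, d) one low-rank mapping branch per mixture component
    W_gate: (d, k)    gating weights producing per-query mixture weights
    E:      (n, d)    entity embedding matrix (n entities)
    Returns (b, n) probabilities over candidate entities.
    """
    # Per-component query contexts: (b, k, d).
    ctx = np.tanh(np.einsum("bd,kde->bke", h, W_proj))
    # Per-component entity logits: (b, k, n).
    logits = np.einsum("bkd,nd->bkn", ctx, E)
    # Mixture weights from the gating network: (b, k).
    pi = softmax(h @ W_gate, axis=-1)
    # Mix probabilities, not logits: the convex combination of k softmaxes
    # is what breaks the rank-d bottleneck of a single linear output layer.
    return np.einsum("bk,bkn->bn", pi, softmax(logits, axis=-1))
```

A single-softmax baseline corresponds to `k = 1`, where the layer collapses back to one rank-limited vector-matrix product; the extra parameter cost of the mixture is only the gate and the k projection maps, independent of the (much larger) entity count n.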