Low-Dimensional Federated Knowledge Graph Embedding via Knowledge Distillation

📅 2024-08-11
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the high communication overhead, storage demands, and inference inefficiency caused by high-dimensional embeddings in Federated Knowledge Graph Embedding (FKGE), this paper proposes FedKD, a lightweight knowledge distillation framework. Methodologically, it introduces the first adaptive-temperature distillation mechanism tailored for FKGE, dynamically scaling positive and negative triple scores to mitigate teacher model overconfidence; it further designs a dynamic weighting strategy for the KD loss to prevent communication redundancy across training rounds. Technically, FedKD integrates knowledge distillation, KL-divergence loss, and low-dimensional graph embedding optimization. Extensive experiments on three standard benchmarks demonstrate that FedKD reduces communication volume and model size by up to 72%, while maintaining or even surpassing the link prediction performance of high-dimensional baselines—achieving, for the first time, efficient and lossless model compression in FKGE.

📝 Abstract
Federated Knowledge Graph Embedding (FKGE) aims to facilitate collaborative learning of entity and relation embeddings from distributed Knowledge Graphs (KGs) across multiple clients, while preserving data privacy. Training FKGE models with higher dimensions is typically favored due to their potential for achieving superior performance. However, high-dimensional embeddings present significant challenges in terms of storage resources and inference speed. Unlike traditional KG embedding methods, FKGE involves multiple client-server communication rounds, where communication efficiency is critical. Existing embedding compression methods for traditional KGs may not be directly applicable to FKGE, as they often require multiple model trainings, which potentially incur substantial communication costs. In this paper, we propose a lightweight component based on Knowledge Distillation (KD), titled FedKD and tailored specifically for FKGE methods. During client-side local training, FedKD guides the low-dimensional student model to mimic the score distribution of triples from the high-dimensional teacher model using a KL divergence loss. Unlike the traditional KD approach, FedKD adaptively learns a temperature to scale the scores of positive triples and separately adjusts the scores of the corresponding negative triples using a predefined temperature, thereby mitigating the teacher over-confidence issue. Furthermore, we dynamically adjust the weight of the KD loss to optimize the training process. Extensive experiments on three datasets support the effectiveness of FedKD.
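The dual-temperature distillation described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, the convention that index 0 holds the positive triple's score, and the fixed scalar `t_pos` (which the paper instead learns adaptively) are all assumptions for the sake of the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fedkd_distill_loss(teacher_scores, student_scores, t_pos, t_neg):
    """KL divergence between temperature-scaled teacher and student
    score distributions over one positive triple and its negatives.

    Convention (assumed for this sketch): index 0 is the positive
    triple's score, the remaining entries are its negatives.
    t_pos scales the positive score (adaptively learned in the paper,
    a fixed scalar here); t_neg is the predefined temperature applied
    to the negative scores.
    """
    def scale(scores):
        s = np.asarray(scores, dtype=float).copy()
        s[0] = s[0] / t_pos    # positive triple: adaptive temperature
        s[1:] = s[1:] / t_neg  # negative triples: fixed temperature
        return s

    p = softmax(scale(teacher_scores))   # teacher soft targets
    q = softmax(scale(student_scores))   # student distribution
    return float(np.sum(p * np.log(p / q)))  # KL(p || q)
```

Scaling the positive score by a larger learned temperature softens an over-confident teacher's peak at the positive triple, which is the over-confidence mitigation the abstract refers to; the loss is zero when student and teacher produce identical scaled distributions.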
Problem

Research questions and friction points this paper is trying to address.

Compressing federated knowledge graph embeddings efficiently
Reducing communication costs in distributed learning settings
Addressing storage and inference challenges with low-dimensional models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge distillation for low-dimensional embeddings
Adaptive temperature scaling for confidence mitigation
Dynamic adjustment of KD loss weight
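The last innovation above, dynamically weighting the KD loss across communication rounds, might be sketched as a simple decay schedule. The linear form, the bounds `w_max`/`w_min`, and the function names below are hypothetical; the paper's actual weighting strategy is not specified here.

```python
def kd_weight(round_idx, total_rounds, w_max=1.0, w_min=0.1):
    """Hypothetical linear decay of the KD-loss weight over federated
    communication rounds, from w_max at round 0 down to w_min."""
    frac = round_idx / max(total_rounds - 1, 1)
    return w_max - (w_max - w_min) * frac

def total_loss(task_loss, kd_loss, round_idx, total_rounds):
    # Combine the client's local embedding loss with the distillation
    # loss, down-weighting distillation in later rounds to avoid
    # redundant teacher signal across communication rounds.
    w = kd_weight(round_idx, total_rounds)
    return task_loss + w * kd_loss
```

Decaying the KD weight lets early rounds transfer the teacher's score distribution aggressively while later rounds prioritize the student's own link-prediction objective.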