🤖 AI Summary
In federated XGBoost, honest-but-curious adversaries can inspect sensitive intermediate data, such as gradient statistics, which poses serious privacy risks. Method: This paper proposes a secure federated XGBoost framework supporting both horizontal and vertical federated learning settings. Built upon NVIDIA FLARE, it introduces a pluggable homomorphic encryption (HE) processor interface with both CPU-based and CUDA-accelerated HE plugins, enabling secure aggregation over ciphertexts. Contribution/Results: It is the first work to simultaneously achieve plaintext-level training performance and strong privacy guarantees in federated XGBoost. In vertical settings, the CUDA-accelerated HE plugin achieves speedups of up to 30x over existing third-party HE solutions, substantially reducing the computational overhead of HE-based federated learning.
📝 Abstract
Federated learning (FL) enables collaborative model training across decentralized datasets. NVIDIA FLARE's Federated XGBoost extends the popular XGBoost algorithm to both vertical and horizontal federated settings, facilitating joint model development without direct data sharing. However, the initial implementation assumed mutual trust over the sharing of intermediate gradient statistics produced by the XGBoost algorithm, leaving potential vulnerabilities to honest-but-curious adversaries. This work introduces "Secure Federated XGBoost", an efficient solution to mitigate these risks. We implement secure federated algorithms for both vertical and horizontal scenarios, addressing diverse data security patterns. To secure the messages, we leverage homomorphic encryption (HE) to protect sensitive information during training. A novel plugin and processor interface seamlessly integrates HE into the Federated XGBoost pipeline, enabling secure aggregation over ciphertexts. We present both CPU-based and CUDA-accelerated HE plugins, demonstrating significant performance gains. Notably, our CUDA-accelerated HE implementation achieves up to 30x speedups in vertical Federated XGBoost compared to existing third-party solutions. By securing critical computation steps and encrypting sensitive assets, Secure Federated XGBoost provides robust data privacy guarantees, reinforcing the fundamental benefits of federated learning while maintaining high performance.
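To make "secure aggregation over ciphertexts" concrete, here is a minimal sketch of how an additively homomorphic scheme lets one party sum gradient statistics it cannot read. It is not the paper's implementation or the NVIDIA FLARE API. It assumes a SecureBoost-style vertical flow in which the label owner encrypts per-sample gradients and a feature owner accumulates them into histogram bins, and it uses a toy Paillier cryptosystem with deliberately small primes. All names (`encrypt`, `add_encrypted`, `encrypted_histogram`, `SCALE`) are hypothetical, and the code needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

# Toy Paillier key material built from two Mersenne primes.
# ASSUMPTION: illustrative only; these primes are far too small for real security.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private key: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # valid shortcut because the generator is g = N + 1

SCALE = 10**6  # fixed-point scale used to encode float gradients as integers


def encrypt(value: float) -> int:
    """Encrypt a float under the public key (N, g = N + 1)."""
    m = round(value * SCALE) % N  # negatives wrap around modulo N
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + N * m) * pow(r, N, N_SQ) % N_SQ


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N_SQ


def decrypt(c: int) -> float:
    """Decrypt with the private key and undo the fixed-point encoding."""
    m = (pow(c, LAM, N_SQ) - 1) // N * MU % N
    if m > N // 2:  # map the upper half of the range back to negatives
        m -= N
    return m / SCALE


def encrypted_histogram(enc_grads, bin_ids, n_bins):
    """Feature owner: sum encrypted gradients per bin without decrypting."""
    hist = [encrypt(0.0) for _ in range(n_bins)]
    for c, b in zip(enc_grads, bin_ids):
        hist[b] = add_encrypted(hist[b], c)
    return hist


# Label owner encrypts its per-sample gradients and shares only ciphertexts.
grads = [0.25, -0.5, 0.125, 0.75, -0.25]
enc_grads = [encrypt(g) for g in grads]

# Feature owner knows which bin each sample falls into for its own feature.
bin_ids = [0, 1, 0, 2, 1]
enc_hist = encrypted_histogram(enc_grads, bin_ids, n_bins=3)

# Only the label owner, who holds the private key, can read the bin sums.
hist = [decrypt(c) for c in enc_hist]
print(hist)
```

The per-bin accumulation is a long series of independent modular multiplications on large integers, which is the kind of workload a GPU-accelerated HE plugin is meant to speed up.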