🤖 AI Summary
To address the privacy and security risks of uploading raw smart meter data for power load forecasting, this paper proposes a privacy-preserving collaborative modeling framework based on split learning (SL). In this framework, smart meters run only the forward pass of the client-side model split and share intermediate activations rather than raw data, while Grid Stations (GS) and the Service Provider (SP) jointly train the model, so raw readings never leave the meter. The paper combines SL with differential privacy (DP) for load forecasting, supports personalized model splits at each GS together with either a global or a per-GS personalized split at the SP, and analyzes how much information about the raw data leaks through the shared activations. Experiments show that the proposed models match or exceed the accuracy of a centrally trained model and generalize well, and the paper examines how DP perturbation affects performance, giving a view of the privacy-utility trade-off.
📝 Abstract
Accurate load forecasting is crucial for energy management, infrastructure planning, and demand-supply balancing. The growing availability of smart meter data has driven demand for sensor-based load forecasting. Conventional ML trains a single global model on data from multiple smart meters, which requires transferring that data to a central server and raises concerns about network requirements, privacy, and security. We propose a split learning-based framework for load forecasting to alleviate this issue. We split a deep neural network model into two parts: one for each Grid Station (GS), which is responsible for an entire neighbourhood's smart meters, and the other for the Service Provider (SP). Instead of sharing their data, client smart meters use their respective GS's model split for the forward pass and share only their activations with the GS. Under this framework, each GS is responsible for training a personalized model split for its neighbourhood, whereas the SP can train either a single global model split or a personalized one for each GS. Experiments show that the proposed models match or exceed a centrally trained model's performance and generalize well. Privacy is analyzed by assessing information leakage between the data and the shared activations of the GS model split. Additionally, we apply differential privacy to enhance local data privacy and examine its impact on performance. A transformer model is used as our base learner.
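The split training loop described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the transformer base learner is replaced by two tiny layers, the data are synthetic, and every name (`W_gs`, `W_sp`, `train_step`, `run`, `sigma`) is hypothetical. The only values that cross the split are the activations going up and their gradient coming back. Setting `sigma > 0` adds Gaussian noise to the activations before they are shared, in the spirit of a differential-privacy mechanism; no formal privacy budget is claimed for this toy setup.

```python
import numpy as np


def train_step(X, y, W_gs, W_sp, rng, lr=0.01, sigma=0.0):
    """One split-learning step on a batch of load windows X with targets y."""
    # Client side: forward pass through the GS model split only.
    h = np.tanh(X @ W_gs)                      # activations bounded in [-1, 1]
    # Optional Gaussian perturbation of the activations before sharing.
    sent = h + sigma * rng.normal(size=h.shape)

    # SP side: finish the forward pass and compute the mean squared error.
    pred = sent @ W_sp
    err = pred - y
    loss = float(np.mean(err ** 2))
    grad_pred = 2.0 * err / len(X)
    grad_W_sp = sent.T @ grad_pred
    grad_sent = grad_pred @ W_sp.T             # gradient returned across the split

    # Client side: backpropagate through tanh and the GS layer.
    grad_pre = grad_sent * (1.0 - h ** 2)
    grad_W_gs = X.T @ grad_pre

    return W_gs - lr * grad_W_gs, W_sp - lr * grad_W_sp, loss


def run(sigma=0.0, steps=300, seed=0):
    """Train both splits on synthetic data and return the loss history."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for smart meter data: 64 windows of 8 past readings.
    X = rng.normal(size=(64, 8))
    y = X @ rng.normal(size=(8, 1)) + 0.01 * rng.normal(size=(64, 1))
    W_gs = 0.1 * rng.normal(size=(8, 4))       # GS model split (assumed shape)
    W_sp = 0.1 * rng.normal(size=(4, 1))       # SP model split (assumed shape)
    losses = []
    for _ in range(steps):
        W_gs, W_sp, loss = train_step(X, y, W_gs, W_sp, rng, sigma=sigma)
        losses.append(loss)
    return losses


if __name__ == "__main__":
    for s in (0.0, 0.5):
        history = run(sigma=s)
        print(f"sigma={s}: first loss {history[0]:.3f}, last loss {history[-1]:.3f}")
```

Comparing the loss histories for different `sigma` values gives a rough feel for the privacy-utility trade-off the abstract mentions: larger noise hides more about the inputs but makes the SP-side split harder to fit.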