🤖 AI Summary
To address model and data leakage caused by eavesdropping attacks in deceptive-signal-assisted private multi-hop split learning, this paper proposes a privacy-preserving framework that jointly optimizes model partitioning and device task allocation. Methodologically, the authors design a Soft Actor-Critic (SAC) reinforcement learning algorithm that integrates an Intrinsic Curiosity Module (ICM) and a Cross-Attention (CA) mechanism. A centralized agent uses it to assign device roles (model training or deceptive signal transmission), control transmit power, and allocate sub-models, without prior knowledge of the eavesdroppers' positions or monitoring probabilities. The key contribution is the first incorporation of ICM and CA into joint privacy and resource optimization for multi-hop split learning, which improves exploration efficiency and decision robustness. Experiments show that, compared to baseline SAC, the approach improves the convergence rate by up to 3× and reduces the information leaked to eavesdroppers by up to 13%, while satisfying the latency and energy consumption constraints.
📝 Abstract
In this paper, deceptive signal-assisted private split learning is investigated. In our model, several edge devices collaboratively train a model, while eavesdroppers aim to collect model and data information from the devices. To prevent the eavesdroppers from collecting this information, a subset of devices can transmit deceptive signals. It is therefore necessary to determine the subset of devices used for deceptive signal transmission, the subset of devices used for model training, and the sub-models assigned to each model training device. This is formulated as an optimization problem whose goal is to minimize the information leaked to eavesdroppers while meeting the energy consumption and delay constraints of model training. To solve this problem, we propose a soft actor-critic (SAC) deep reinforcement learning framework with an intrinsic curiosity module and cross-attention (ICM-CA). It enables a centralized agent to determine the model training devices, the deceptive signal transmission devices, the transmit power, and the sub-models assigned to each model training device without knowing the positions and monitoring probabilities of the eavesdroppers. The proposed method uses the ICM to encourage the server to explore novel actions and states, and the CA module to determine the importance of each historical state-action pair, thus improving training efficiency. Simulation results demonstrate that the proposed method improves the convergence rate by up to 3× and reduces the information leaked to eavesdroppers by up to 13% compared to the traditional SAC algorithm.
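The two modules named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the curiosity bonus takes the common form of a scaled squared prediction error from a forward model, and that the cross-attention step is ordinary scaled dot-product attention with the current state as the query and historical state-action embeddings as keys and values. The function names, the scale factor `eta`, and the toy vectors are all hypothetical.

```python
import math

def intrinsic_reward(predicted_next, actual_next, eta=0.5):
    """Curiosity bonus (assumed form): (eta / 2) * squared error between the
    forward model's predicted next-state features and the observed ones."""
    sq_err = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return 0.5 * eta * sq_err

def attention_weights(query, keys):
    """Softmax of scaled dot products between the current-state query and
    each historical state-action embedding (one key per past pair)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Importance-weighted summary of the historical state-action pairs."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy usage: a poorly predicted transition earns a larger curiosity bonus,
# and the history entry most aligned with the query gets the largest weight.
bonus = intrinsic_reward([1.0, 0.0], [0.0, 0.0])
weights = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In a SAC training loop, a bonus of this kind would typically be added to the environment reward with a small coefficient, and the attended summary would be fed to the actor and critic alongside the current state. How the paper combines these quantities is not stated in the abstract.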