Relative Entropy Regularized Reinforcement Learning for Efficient Encrypted Policy Synthesis

📅 2025-06-14
🏛️ IEEE Control Systems Letters
📈 Citations: 0
Influential: 0
🤖 AI Summary
Privacy-preserving model-based reinforcement learning (RL) faces severe efficiency bottlenecks because non-linear operations, particularly min/max and logarithmic/exponential functions, are expensive to evaluate under fully homomorphic encryption (FHE). Method: This paper proposes a cryptographic policy synthesis framework based on relative-entropy-regularized RL (RERL). The authors first show that RERL value iteration admits a linear, min-free structure, enabling native FHE embedding. By integrating bootstrapping with quantization error modeling, the framework achieves end-to-end, provably secure encrypted policy synthesis. Contribution/Results: Theoretically, the paper establishes an error-propagation model and a convergence bound for encrypted policy synthesis. Empirically, the framework significantly improves the computational efficiency and practicality of encrypted RL while preserving policy performance. The core innovation lies in coupling RERL's structural properties with the computational constraints of FHE, thereby overcoming the performance limitations imposed by non-linear operations in conventional encrypted RL approaches.

📝 Abstract
We propose an efficient encrypted policy synthesis scheme for privacy-preserving model-based reinforcement learning. We first demonstrate that the relative-entropy-regularized reinforcement learning (RERL) framework offers a computationally convenient linear, "min-free" structure for value iteration, enabling a direct and efficient integration of fully homomorphic encryption (FHE) with bootstrapping into policy synthesis. We analyze convergence and error bounds as encrypted policy synthesis propagates encryption-induced errors, including quantization and bootstrapping errors. The theoretical analysis is validated by numerical simulations, and the results demonstrate the effectiveness of the RERL framework in integrating FHE for encrypted policy synthesis.
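The "min-free" structure claimed above matches the classical linearly-solvable form of relative-entropy-regularized control: in the exponentiated "desirability" variable z = exp(-V), the Bellman update collapses to a matrix-vector product, which is exactly the kind of operation FHE handles natively. The paper's own formulation is not reproduced here; the following is a minimal plaintext sketch under that assumption, using a made-up 3-state chain (state costs `q`, passive dynamics `P` are toy illustration data):

```python
import numpy as np

# Hedged sketch (not the paper's code): relative-entropy-regularized value
# iteration in the desirability variable z = exp(-V). The update is a plain
# matrix-vector product -- no min/max or log/exp inside the loop -- which is
# what makes it amenable to homomorphic evaluation.

def rerl_value_iteration(q, P, iters=200):
    """Min-free value iteration: z <- exp(-q) * (P @ z)."""
    z = np.ones(len(q))
    G = np.exp(-q)                      # per-state cost factor (precomputed)
    for _ in range(iters):
        z = G * (P @ z)                 # linear, min-free Bellman update
    V = -np.log(z)                      # recover the value function
    # The optimal policy tilts the passive dynamics toward desirable states.
    pi = P * z[None, :]
    pi /= pi.sum(axis=1, keepdims=True)
    return V, pi

# Toy 3-state chain with an absorbing zero-cost goal state (illustrative data)
q = np.array([0.5, 0.5, 0.0])
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.0, 1.0]])
V, pi = rerl_value_iteration(q, P)
```

In this toy chain the recovered value function decreases monotonically toward the goal state, and each policy row is a proper distribution over successor states.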
Problem

Research questions and friction points this paper is trying to address.

Develop privacy-preserving reinforcement learning with encryption
Integrate homomorphic encryption into policy synthesis efficiently
Analyze convergence and errors in encrypted policy synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relative-entropy-regularized RL for linear value iteration
Integrates fully homomorphic encryption with bootstrapping
Analyzes convergence under encryption-induced errors
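To see why the quantization error modeling in the last bullet matters, one can mimic the encrypted pipeline in plaintext: round the iterate to a fixed-point grid after every update, as ciphertext encoding would, and compare against the exact iteration. This is an illustrative, assumption-laden sketch, not the paper's experiment; `delta` stands in for the encryption-induced quantization step, and `q`, `P` are toy data:

```python
import numpy as np

# Hedged sketch of encryption-induced error propagation: each min-free update
# z <- exp(-q) * (P @ z) is followed by rounding to a fixed-point grid with
# step `delta`, standing in for quantization under encryption. Because the
# update is a contraction on the transient states, the accumulated deviation
# stays comparable to the quantization step rather than growing with iterations.

def quantize(x, delta):
    return np.round(x / delta) * delta

def iterate(q, P, iters, delta=None):
    z = np.ones(len(q))
    G = np.exp(-q)
    for _ in range(iters):
        z = G * (P @ z)
        if delta is not None:
            z = quantize(z, delta)      # per-iteration quantization noise
    return z

# Same toy 3-state chain (illustrative data only)
q = np.array([0.5, 0.5, 0.0])
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.0, 1.0]])

z_exact = iterate(q, P, 200)
z_quant = iterate(q, P, 200, delta=2**-12)
err = np.max(np.abs(z_exact - z_quant))
```

The paper's analysis additionally accounts for bootstrapping noise, which this plaintext toy does not model.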