Privacy Preserving Reinforcement Learning with One-Sided Feedback

📅 2026-05-18

🤖 AI Summary
This work addresses multi-dimensional continuous reinforcement learning under one-sided feedback, where the agent observes only partial state information and receives reward information for only a subset of state-action pairs. The setting makes both learning efficiency and privacy preservation difficult. The paper proposes the POOL algorithm, which integrates differential privacy into this setting for the first time by modeling and optimizing partially observable feedback. Theoretical analysis shows that the algorithm's sample complexity matches the known lower bounds for non-private RL, indicating that strong privacy guarantees can be enforced while maintaining high learning efficiency.
πŸ“ Abstract
We study reinforcement learning (RL) in multi-dimensional continuous state and action spaces with one-sided feedback, where the agent receives partial observations of the state and obtains reward information for only a subset of the state-action space at each time step. This setting introduces substantial challenges in both learning efficiency and privacy preservation. To address these challenges, we propose POOL, a novel privacy-preserving RL algorithm. We conduct a comprehensive theoretical analysis of POOL, deriving a sample complexity bound that matches the known lower bounds for non-private RL. The bound is stated in terms of E_rho, the privacy parameter; H, the time horizon; and alpha, the optimality-gap parameter. Our findings show that it is possible to enforce strong privacy guarantees while maintaining high learning efficiency, marking a significant step toward practical, privacy-aware RL in multi-dimensional environments with one-sided feedback.
Problem

Research questions and friction points this paper is trying to address.

Privacy Preserving
Reinforcement Learning
One-Sided Feedback
Continuous State Space
Partial Observations
Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-preserving reinforcement learning
one-sided feedback
sample complexity
continuous state-action spaces
differential privacy