Differential Privacy in Kernelized Contextual Bandits via Random Projections

📅 2025-07-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies differentially private kernelized contextual bandits with stochastic contexts, where the sequence of query points must be private with respect to both the contexts and the rewards. It proposes a private kernel ridge regression estimator that combines private random projections with private covariance estimation. The estimator has significantly lower sensitivity than its classical counterpart while maintaining high prediction accuracy. The same approach covers both the joint and local models of differential privacy. It attains state-of-the-art cumulative regret in both settings: $\widetilde{\mathcal{O}}\big(\sqrt{\gamma_T T} + \frac{\gamma_T}{\varepsilon_{\mathrm{DP}}}\big)$ in the joint model and $\widetilde{\mathcal{O}}\big(\sqrt{\gamma_T T} + \frac{\gamma_T \sqrt{T}}{\varepsilon_{\mathrm{DP}}}\big)$ in the local model, where $T$ is the time horizon, $\gamma_T$ is the effective dimension of the kernel, and $\varepsilon_{\mathrm{DP}} > 0$ is the privacy parameter.

📝 Abstract

We consider the problem of contextual kernel bandits with stochastic contexts, where the underlying reward function belongs to a known Reproducing Kernel Hilbert Space. We study this problem under an additional constraint of Differential Privacy, where the agent needs to ensure that the sequence of query points is differentially private with respect to both the sequence of contexts and rewards. We propose a novel algorithm that achieves the state-of-the-art cumulative regret of $\widetilde{\mathcal{O}}(\sqrt{\gamma_T T}+\frac{\gamma_T}{\varepsilon_{\mathrm{DP}}})$ and $\widetilde{\mathcal{O}}(\sqrt{\gamma_T T}+\frac{\gamma_T\sqrt{T}}{\varepsilon_{\mathrm{DP}}})$ over a time horizon of $T$ in the joint and local models of differential privacy, respectively, where $\gamma_T$ is the effective dimension of the kernel and $\varepsilon_{\mathrm{DP}} > 0$ is the privacy parameter. The key ingredient of the proposed algorithm is a novel private kernel-ridge regression estimator which is based on a combination of private covariance estimation and private random projections. It offers a significantly reduced sensitivity compared to its classical counterpart while maintaining a high prediction accuracy, allowing our algorithm to achieve the state-of-the-art performance guarantees.
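
One way to read the two regret bounds is to ask when the privacy term is no larger than the non-private term $\sqrt{\gamma_T T}$. The short calculation below is only algebra on the stated rates; it ignores the constants and logarithmic factors hidden in $\widetilde{\mathcal{O}}$ and is not a claim made in the abstract itself.

```latex
\begin{align*}
\text{Joint model:}\quad
  \frac{\gamma_T}{\varepsilon_{\mathrm{DP}}} \le \sqrt{\gamma_T T}
  &\iff \varepsilon_{\mathrm{DP}} \ge \sqrt{\gamma_T / T}, \\
\text{Local model:}\quad
  \frac{\gamma_T \sqrt{T}}{\varepsilon_{\mathrm{DP}}} \le \sqrt{\gamma_T T}
  &\iff \varepsilon_{\mathrm{DP}} \ge \sqrt{\gamma_T}.
\end{align*}
```

Under this reading, the threshold on $\varepsilon_{\mathrm{DP}}$ in the joint model shrinks as $T$ grows whenever $\gamma_T$ grows sublinearly in $T$, whereas in the local model the threshold grows with $\gamma_T$.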

Problem

Research questions and friction points this paper is trying to address.

Achieving differential privacy in kernelized contextual bandits

Balancing privacy and regret in stochastic contextual settings

Developing private kernel-ridge regression for reduced sensitivity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Private kernel-ridge regression via random projections

Differential privacy in contextual kernel bandits

Combines private covariance estimation with projections
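
The abstract does not spell out the estimator, so the following is only a generic sketch of the two ingredients named above, not the paper's algorithm. It assumes random Fourier features as the random projection and a textbook Gaussian perturbation of the ridge-regression sufficient statistics as the private covariance step. The function names `random_features` and `noisy_ridge`, the even budget split, and the noise calibration are all illustrative assumptions.

```python
import numpy as np


def random_features(X, W, b):
    """Project inputs onto random Fourier features (approximating an RBF kernel).

    Every feature vector has Euclidean norm at most sqrt(2).
    """
    m = W.shape[1]
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)


def noisy_ridge(Phi, y, eps, delta, lam, rng):
    """Ridge regression on Gaussian-perturbed sufficient statistics.

    Illustrative only: noise scales follow the classical Gaussian-mechanism
    formula (stated for eps < 1), with the (eps, delta) budget split evenly
    between the covariance and the feature-reward vector. This is a sketch,
    not a verified privacy proof and not the paper's estimator.
    """
    m = Phi.shape[1]
    y = np.clip(y, -1.0, 1.0)  # bound each reward's influence
    scale = np.sqrt(2.0 * np.log(1.25 / (delta / 2))) / (eps / 2)

    # Replacing one sample changes Phi^T Phi by at most 4 (Frobenius norm)
    # and Phi^T y by at most 2*sqrt(2), since ||phi|| <= sqrt(2) and |y| <= 1.
    upper = np.triu(rng.normal(scale=4.0 * scale, size=(m, m)))
    cov_noise = upper + np.triu(upper, k=1).T  # symmetric noise matrix
    vec_noise = rng.normal(scale=2.0 * np.sqrt(2.0) * scale, size=m)

    A = Phi.T @ Phi + cov_noise + lam * np.eye(m)
    v = Phi.T @ y + vec_noise
    return np.linalg.solve(A, v)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(500, 2))       # synthetic contexts
    y = np.sin(3.0 * X[:, 0]) * np.cos(2.0 * X[:, 1])  # synthetic rewards
    W = rng.normal(size=(2, 40))                    # random projection directions
    b = rng.uniform(0.0, 2.0 * np.pi, size=40)
    Phi = random_features(X, W, b)
    theta = noisy_ridge(Phi, y, eps=0.5, delta=1e-5, lam=1.0, rng=rng)
    print("mean squared error:", np.mean((Phi @ theta - y) ** 2))
```

Working in a fixed, finite-dimensional feature space keeps the statistics that receive noise at a size set by the number of features rather than the number of samples, which is one plausible reason a projection step can help with sensitivity. With small `eps` the perturbed matrix can be ill-conditioned, so a larger `lam` may be needed in practice.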

🔎 Similar Papers

Revisiting Privacy-Utility Trade-off for DP Training with Pre-existing Knowledge