🤖 AI Summary
This work addresses the challenge of achieving both accuracy and robustness in online differentially private linear query release under a fixed global privacy budget. The authors propose a learning-augmented privacy mechanism that answers queries accurately predicted by an external model using the offline optimal matrix mechanism, while dynamically allocating the remaining privacy budget to handle unpredicted queries in an online fashion. The key innovation is a Smooth Allocation strategy that leverages unbiased estimates from early unpredicted queries to adaptively adjust privacy spending: when predictions are accurate, the method approaches the utility of the offline optimal mechanism; when predictions fail, it gracefully degrades to a robust baseline. Experiments on two real-world datasets demonstrate near-offline performance under high prediction overlap and smooth degradation under low overlap.
📝 Abstract
Modern database workloads are highly predictable: query streams are dominated by recurring jobs and templates, even when their arrival order is not known in advance. This motivates a learning-augmented view of online differentially private (DP) analytics: can algorithms utilize predictions about which queries will occur to improve utility under a single global privacy budget, while remaining robust when predictions are wrong? We study online DP query answering, where a curator must answer a stream $Q$ of $S$ linear queries arriving in uniformly random order under privacy budget $(ε,δ)$. We present LAPRAS, which assumes access to an oracle that outputs a prediction set of queries likely to appear in the stream and uses it to guide privacy spending. LAPRAS answers predicted queries using the offline-optimal Matrix Mechanism and answers the remaining queries online from a residual budget. To pace spending across an unknown number of unpredicted queries, we introduce Smooth Allocation, which forms an unbiased stopping-time estimate $\widehat{B}$ from the first $T=Θ(\log^2 S)$ unpredicted queries and continuously recalibrates per-query expenditure. Empirically, over two real datasets, we validate the intended consistency-robustness trade-off: LAPRAS achieves near-offline utility under high overlap and degrades gracefully to baseline-level performance when overlap is low.
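The abstract does not spell out how $\widehat{B}$ is computed, so the following is only an illustrative sketch of one way a stopping-time estimate of this kind could work, not the authors' construction. It assumes that $B$ denotes the total number of unpredicted queries in the stream, and it uses the standard inverse-sampling estimator $S(T-1)/(\tau-1)$, where $\tau$ is the stream position of the $T$-th unpredicted query. Under a uniformly random arrival order this estimator is unbiased for $B$. The function names and the toy stream are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def estimate_unpredicted(stream, predicted, T):
    """Estimate the total number of unpredicted queries in `stream`.

    Scans the stream until the T-th query that is not in `predicted` and
    returns (tau, B_hat), where tau is that query's 1-based position and
    B_hat = S * (T - 1) / (tau - 1). Returns None if the stream holds
    fewer than T unpredicted queries. Requires T >= 2.
    (Assumed estimator; the paper's exact form may differ.)
    """
    if T < 2:
        raise ValueError("T must be at least 2")
    S = len(stream)
    seen = 0
    for pos, query in enumerate(stream, start=1):
        if query not in predicted:
            seen += 1
            if seen == T:
                return pos, Fraction(S * (T - 1), pos - 1)
    return None


def mean_estimate_over_all_orders(queries, predicted, T):
    """Average B_hat over every arrival order (feasible for tiny streams only)."""
    total, count = Fraction(0), 0
    for order in permutations(queries):
        _, b_hat = estimate_unpredicted(order, predicted, T)
        total += b_hat
        count += 1
    return total / count


# Toy stream: 6 queries, of which q3, q4, q5 are unpredicted (so B = 3).
queries = ["q0", "q1", "q2", "q3", "q4", "q5"]
predicted = {"q0", "q1", "q2"}
print(mean_estimate_over_all_orders(queries, predicted, T=2))
```

Averaging over all 720 arrival orders of this toy stream recovers $B = 3$ exactly, which is the unbiasedness property in miniature. How LAPRAS then converts $\widehat{B}$ into a per-query expenditure is not described in the abstract and is left out of the sketch.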