Adaptive Backtracking for Privacy Protection in Large Language Models

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the long-overlooked risk of enterprise data leakage in retrieval-augmented generation (RAG) systems. The authors propose ABack, a training-free, enterprise-oriented privacy protection framework: it pinpoints the origin of a leakage intention via hidden-state modeling and performs adaptive backtracking through a fine-tuning-free output rewriting mechanism. They further introduce PriGenQA, the first enterprise privacy evaluation benchmark for the finance and healthcare domains, and propose a strong adversarial evaluation paradigm based on Group Relative Policy Optimization. Experiments show that ABack improves the overall privacy-utility score by up to 15% over strong baselines while preserving near-original model performance, effectively balancing security and practicality.

📝 Abstract
The preservation of privacy has emerged as a critical topic in the era of artificial intelligence. However, current work focuses on user-oriented privacy, overlooking severe enterprise data leakage risks exacerbated by the Retrieval-Augmented Generation paradigm. To address this gap, our paper introduces a novel objective: enterprise-oriented privacy concerns. Achieving this objective requires overcoming two fundamental challenges: existing methods such as data sanitization severely degrade model performance, and the field lacks public datasets for evaluation. We address these challenges with several solutions. (1) To prevent performance degradation, we propose ABack, a training-free mechanism that leverages a Hidden State Model to pinpoint the origin of a leakage intention and rewrite the output safely. (2) To solve the lack of datasets, we construct PriGenQA, a new benchmark for enterprise privacy scenarios in healthcare and finance. To ensure a rigorous evaluation, we move beyond simple static attacks by developing a powerful adaptive attacker with Group Relative Policy Optimization. Experiments show that against this superior adversary, ABack improves the overall privacy utility score by up to 15% over strong baselines, avoiding the performance trade-offs of prior methods.
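The adaptive attacker described above is trained with Group Relative Policy Optimization (GRPO), which scores each sampled attack prompt relative to the other samples in its group rather than against a learned value function. A minimal sketch of that group-relative advantage computation is below; the function name and reward values are illustrative assumptions, not the paper's training loop.

```python
import statistics

def grpo_advantages(rewards):
    """GRPO-style group-relative advantages: normalize each sampled
    attack's reward against the mean and std of its own group.
    Illustrative helper, not the paper's implementation."""
    mean = statistics.mean(rewards)
    # Guard against a zero-variance group (all rewards identical).
    std = statistics.pstdev(rewards) or 1.0
    return [(r - mean) / std for r in rewards]

# Two sampled attack prompts: one that leaked data (reward 3.0)
# and one that was blocked (reward 1.0).
print(grpo_advantages([1.0, 3.0]))  # → [-1.0, 1.0]
```

Attacks that extract more private content than their group's average receive positive advantage and are reinforced, which is what makes the adversary adaptive rather than static.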
Problem

Research questions and friction points this paper is trying to address.

Address enterprise data leakage risks in LLMs
Overcome performance degradation from privacy methods
Lack of public datasets for privacy evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

ABack: training-free leakage prevention via Hidden State Model
PriGenQA: benchmark for enterprise privacy in healthcare and finance
Group Relative Policy Optimization for adaptive attacker evaluation
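The core ABack idea, as summarized above, is to detect a leakage intention during generation, backtrack to where that intention originated, and rewrite the output from that point, all without fine-tuning. A toy sketch of that control flow is below; the keyword-based detector, the one-token rollback, and the `[REDACTED]` replacement are all illustrative stand-ins for the paper's hidden-state modeling and rewriting mechanism.

```python
# Terms standing in for content the hidden-state detector would flag.
SENSITIVE = {"ssn", "diagnosis", "account_balance"}

def detect_leak(token: str) -> bool:
    """Stand-in for ABack's hidden-state leak detector."""
    return token.lower() in SENSITIVE

def generate_with_backtracking(draft_tokens, replacement="[REDACTED]"):
    """Emit tokens one by one; on a flagged token, backtrack past the
    token that set up the leak and rewrite safely from there."""
    output = []
    for tok in draft_tokens:
        if detect_leak(tok):
            if output:
                output.pop()  # roll back to before the leak intention
            output.append(replacement)
            break  # a real system would resume decoding from here
        output.append(tok)
    return output

print(generate_with_backtracking(
    ["Patient", "record:", "diagnosis", "is", "hypertension"]))
```

The key design point the sketch mirrors is that intervention happens mid-generation at the origin of the leak, rather than by sanitizing the retrieved documents up front, which is what lets ABack avoid the performance degradation of data-sanitization baselines.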