🤖 AI Summary
This work investigates reverse-engineering attacks against obfuscated embeddings in the black-box setting: given only obfuscated embedding vectors and a publicly available embedding table—without access to the underlying language model or obfuscation mechanism—the goal is to reconstruct the original token sequence. We propose a language-aware joint estimation framework that, for the first time, unifies language-model priors (e.g., n-gram statistics or lightweight LMs) with noise-parameter modeling (Laplacian/Gaussian) in a single Bayesian formulation, coupled with beam-search decoding for embedding reconstruction. Compared to naive distance-based baselines, our method achieves significantly higher token recovery accuracy. Our results expose a fundamental vulnerability of input-agnostic, fixed-noise obfuscation mechanisms for embedding-level privacy protection, demonstrating their insufficiency against informed adversaries. We thus argue for input-dependent, learnable obfuscation strategies that adapt to linguistic context and semantic structure.
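The summary does not spell out how the noise parameters are estimated. Below is a minimal illustrative sketch (not the paper's actual procedure) for one plausible sub-step under an assumed isotropic Gaussian mechanism: assign each obfuscated vector to its nearest table embedding, then fit the noise scale by maximum likelihood from the residuals. The function name `estimate_noise_scale` and all shapes are hypothetical.

```python
import numpy as np

def estimate_noise_scale(obf_embs, emb_table):
    """Toy noise-scale estimate under assumed isotropic Gaussian noise.

    obf_embs:  (T, d) obfuscated embeddings (token embedding + noise)
    emb_table: (V, d) public embedding table
    Returns (sigma_hat, nearest-token indices).
    """
    # With isotropic noise the nearest-neighbor assignment does not
    # depend on sigma, so a single hard assignment suffices here.
    sq_dists = ((obf_embs[:, None, :] - emb_table[None, :, :]) ** 2).sum(-1)  # (T, V)
    nearest = sq_dists.argmin(axis=1)
    # MLE of sigma from the per-coordinate residuals.
    resid = obf_embs - emb_table[nearest]
    return float(np.sqrt((resid ** 2).mean())), nearest
```

In a full joint-estimation scheme, this hard assignment would instead be informed by the language-model prior, but the residual-based fit of the noise scale works the same way.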
📝 Abstract
In this work, we consider an inversion attack on the obfuscated input embeddings sent to a language model on a server, where the adversary has no access to the language model or the obfuscation mechanism and sees only the obfuscated embeddings along with the model's embedding table. We propose BeamClean, an inversion attack that jointly estimates the noise parameters and decodes token sequences by integrating a language-model prior. Against Laplacian and Gaussian obfuscation mechanisms, BeamClean consistently surpasses naive distance-based attacks. This work underscores the need for more robust, learned, input-dependent obfuscation mechanisms.
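To make the decoding step concrete, here is a minimal sketch of beam search that combines a Gaussian embedding likelihood with a language-model prior, in the spirit of the attack described above. This is not the paper's implementation: the bigram table standing in for the LM prior, the weight `alpha`, and the function name `beam_decode` are all assumptions for illustration.

```python
import numpy as np

def beam_decode(obf_embs, emb_table, log_bigram, sigma, beam_width=3, alpha=1.0):
    """Toy beam search over token sequences.

    obf_embs:   (T, d) obfuscated embeddings (token embedding + Gaussian noise)
    emb_table:  (V, d) public embedding table
    log_bigram: (V, V) log p(next | prev), a toy stand-in for an LM prior
    sigma:      assumed (or estimated) Gaussian noise scale
    Scores each hypothesis by Gaussian log-likelihood + alpha * LM prior.
    """
    T = obf_embs.shape[0]
    V = emb_table.shape[0]
    # Per-position Gaussian log-likelihood of every vocab entry (up to a constant).
    sq_dists = ((obf_embs[:, None, :] - emb_table[None, :, :]) ** 2).sum(-1)  # (T, V)
    loglik = -sq_dists / (2.0 * sigma ** 2)
    beams = [([], 0.0)]  # (token sequence, cumulative score)
    for t in range(T):
        candidates = []
        for seq, score in beams:
            for v in range(V):
                prior = log_bigram[seq[-1], v] if seq else 0.0
                candidates.append((seq + [v], score + loglik[t, v] + alpha * prior))
        candidates.sort(key=lambda c: -c[1])
        beams = candidates[:beam_width]
    return beams[0][0]
```

With a uniform prior this reduces to a nearest-neighbor (distance-based) attack; the gain reported in the paper comes from the prior reweighting hypotheses that are linguistically plausible, which a purely distance-based decoder cannot do.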