🤖 AI Summary
This work tackles the problem of accurately and efficiently reconstructing original text from embedding vectors without access to the target encoder. The authors propose an inversion framework based on conditional masked diffusion, which reformulates the task as parallel iterative denoising rather than conventional sequential autoregressive generation. By conditioning a masked diffusion language model on the target embedding through adaptive layer normalization, the method achieves high-fidelity reconstruction in only eight forward passes. On 32-token sequences, it reaches up to 81.3% token-level accuracy across three mainstream embedding models, improving both the efficiency and the effectiveness of embedding inversion.
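The parallel denoising process described above can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's implementation): `predict_logits` stands in for the masked diffusion language model, and the growing commit schedule is an assumed MaskGIT-style unmasking rule.

```python
import numpy as np

VOCAB, SEQ_LEN, MASK, STEPS = 100, 32, -1, 8
rng = np.random.default_rng(0)

def predict_logits(tokens, embedding):
    # Stand-in for the diffusion LM: a single parallel forward pass
    # yields logits for every position at once (here: random scores).
    return rng.standard_normal((SEQ_LEN, VOCAB))

def invert(embedding):
    tokens = np.full(SEQ_LEN, MASK)  # start from a fully masked sequence
    for step in range(STEPS):
        logits = predict_logits(tokens, embedding)
        probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
        conf, pred = probs.max(-1), probs.argmax(-1)
        # Commit the most confident predictions this round; the number of
        # committed positions grows each step until all 32 are filled.
        n_keep = int(SEQ_LEN * (step + 1) / STEPS)
        keep = np.argsort(-conf)[:n_keep]
        tokens[keep] = pred[keep]
    return tokens

result = invert(np.zeros(768))  # 768-dim embedding is an assumed size
```

After the eighth pass every position has been committed, so the full 32-token sequence is recovered in a fixed number of forward passes, independent of sequence length.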
📝 Abstract
We frame embedding inversion as conditional masked diffusion, recovering all tokens in parallel through iterative denoising rather than sequential autoregressive generation. A masked diffusion language model is conditioned on the target embedding via adaptive layer normalization, requiring only 8 forward passes through a 78M-parameter model with no access to the target encoder. On 32-token sequences across three embedding models, the method achieves up to 81.3% token accuracy. Source code and live demo are available at https://github.com/jina-ai/embedding-inversion-demo.
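The adaptive layer normalization conditioning mentioned in the abstract can be sketched as below. This is an illustrative assumption about the mechanism, not the paper's code: the projection matrices `W_scale` and `W_shift` and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_embed = 64, 768  # assumed hidden and embedding sizes
W_scale = rng.standard_normal((d_embed, d_model)) * 0.02
W_shift = rng.standard_normal((d_embed, d_model)) * 0.02

def ada_layer_norm(h, embedding, eps=1e-5):
    # Standard LayerNorm over the feature axis...
    mu = h.mean(-1, keepdims=True)
    var = h.var(-1, keepdims=True)
    normed = (h - mu) / np.sqrt(var + eps)
    # ...but the scale and shift are predicted from the conditioning
    # embedding instead of being fixed learned constants, so the target
    # embedding steers every normalized layer of the denoiser.
    scale = 1.0 + embedding @ W_scale
    shift = embedding @ W_shift
    return normed * scale + shift

h = rng.standard_normal((32, d_model))  # hidden states for 32 tokens
e = rng.standard_normal(d_embed)        # target embedding to invert
out = ada_layer_norm(h, e)
```

Injecting the condition through the normalization layers, rather than through cross-attention or prompt tokens, keeps the conditioning pathway cheap, which matters for a small 78M-parameter model.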