Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition

📅 2025-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited quality and weak discriminability of pretrained language model (PLM) text embeddings in zero-shot settings, this paper proposes a fine-tuning-free backward attention (BA) mechanism. BA reconstructs the model's self-attention maps and propagates attention information backward through the layers to reweight salient tokens, thereby improving the fidelity of contextual encoding in the embedding space. Crucially, BA requires no additional training, only standard forward inference, and thus preserves model efficiency and parameter integrity. Evaluated on the C-MTEB benchmark, BA consistently improves zero-shot performance across retrieval, classification, and other tasks, yielding average gains of 3.2%–5.7% over baseline embeddings. It significantly outperforms existing unsupervised embedding-enhancement methods, offering an efficient, lightweight approach to zero-shot semantic representation.
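The summary above describes reweighting token embeddings by attention without updating any parameters. The paper's exact algorithm is not given on this page, so the following is only a minimal sketch of the general idea, assuming we already have a frozen model's token embeddings and an averaged attention matrix; the function name `backward_attention_pool` and the choice of column-sum weighting are illustrative assumptions, not the authors' method.

```python
import numpy as np

def backward_attention_pool(token_embs: np.ndarray, attn: np.ndarray) -> np.ndarray:
    """Pool token embeddings into one sentence vector, weighting each token
    by how much attention it *receives* (column sums of the attention matrix).

    token_embs: (seq_len, dim) hidden states from a frozen language model.
    attn: (seq_len, seq_len) row-stochastic attention matrix, e.g. averaged
          over heads and layers. No parameters are trained or updated.
    """
    received = attn.sum(axis=0)          # attention mass flowing *into* each token
    weights = received / received.sum()  # normalize to a probability distribution
    return weights @ token_embs          # attention-weighted average of embeddings

# Toy example: 4 tokens, 3-dimensional embeddings, random attention logits.
rng = np.random.default_rng(0)
embs = rng.standard_normal((4, 3))
logits = rng.standard_normal((4, 4))
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # row softmax
sentence_vec = backward_attention_pool(embs, attn)
print(sentence_vec.shape)  # (3,)
```

Because the weighting uses only quantities already produced during a forward pass, this style of pooling adds no trainable parameters, which is consistent with the training-free claim above.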

📝 Abstract
Language models can be viewed as functions that embed text into Euclidean space, where the quality of the embedding vectors directly determines model performance; yet training such neural networks involves various uncertainties. This paper focuses on improving the performance of pre-trained language models in zero-shot settings through a simple and easily implementable method. We propose a novel backward attention mechanism to enhance contextual information encoding. Evaluated on the Chinese Massive Text Embedding Benchmark (C-MTEB), our approach achieves significant improvements across multiple tasks, providing valuable insights for advancing zero-shot learning capabilities.
Problem

Research questions and friction points this paper is trying to address.

Enhance embeddings of pre-trained language models
Improve zero-shot learning performance without additional training
Propose backward attention mechanism for better contextual encoding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Backward attention mechanism enhances embeddings
Improves zero-shot learning without additional training
Evaluated on Chinese Massive Text Embedding Benchmark
🔎 Similar Papers
2024-02-23 · International Conference on Learning Representations · Citations: 42
Yifei Duan
School of Mathematical Sciences, Laboratory of Mathematics and Complex Systems, MOE, Beijing Normal University, Beijing, 100875, China
Raphael Shang
Beijing Waiyan Online Digital Technology Co., Ltd, Beijing, China
Deng Liang
Beijing Waiyan Online Digital Technology Co., Ltd, Beijing, China
Yongqiang Cai
Beijing Normal University
machine learning · polymer · numerical method