LexSemBridge: Fine-Grained Dense Representation Enhancement through Token-Aware Embedding Augmentation

📅 2025-08-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the mismatch between semantic and lexical signals that dense retrieval models exhibit on fine-grained retrieval tasks—such as keyword alignment and span-level localization—within retrieval-augmented generation (RAG), this paper proposes LexSemBridge. The method introduces token-aware embedding enhancement that integrates three complementary latent enhancement vectors—Statistical (SLR), Learned (LLR), and Contextual (CLR)—into dense representations in a plug-and-play manner, without modifying the backbone encoder. Through element-wise interaction and vector modulation, LexSemBridge amplifies discriminative dimensions while preserving semantic directionality, and the framework extends naturally to both text and vision modalities. Extensive experiments show consistent gains across semantic and fine-grained retrieval benchmarks, validating the method's effectiveness and generality. The code and models are publicly available.

📝 Abstract
As queries in retrieval-augmented generation (RAG) pipelines powered by large language models (LLMs) become increasingly complex and diverse, dense retrieval models have demonstrated strong performance in semantic matching. Nevertheless, they often struggle with fine-grained retrieval tasks, where precise keyword alignment and span-level localization are required, even in cases with high lexical overlap that would intuitively suggest easier retrieval. To systematically evaluate this limitation, we introduce two targeted tasks, keyword retrieval and part-of-passage retrieval, designed to simulate practical fine-grained scenarios. Motivated by these observations, we propose LexSemBridge, a unified framework that enhances dense query representations through fine-grained, input-aware vector modulation. LexSemBridge constructs latent enhancement vectors from input tokens using three paradigms: Statistical (SLR), Learned (LLR), and Contextual (CLR), and integrates them with dense embeddings via element-wise interaction. Theoretically, we show that this modulation preserves the semantic direction while selectively amplifying discriminative dimensions. LexSemBridge operates as a plug-in without modifying the backbone encoder and naturally extends to both text and vision modalities. Extensive experiments across semantic and fine-grained retrieval tasks validate the effectiveness and generality of our approach. All code and models are publicly available at https://github.com/Jasaxion/LexSemBridge/
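The element-wise modulation described in the abstract can be illustrated with a minimal sketch. All names and the specific scaling form below are hypothetical stand-ins (the paper's actual SLR/LLR/CLR constructions are in the linked repository); the sketch only shows why multiplying a unit embedding by small positive per-dimension weights reweights dimensions while largely preserving the semantic direction:

```python
import numpy as np

def modulate(dense_emb, enhance_vec, alpha=0.1):
    """Element-wise modulation: scale each dimension by (1 + alpha * weight),
    then renormalize. enhance_vec holds non-negative per-dimension weights
    (a hypothetical stand-in for an SLR/LLR/CLR enhancement vector)."""
    modulated = dense_emb * (1.0 + alpha * enhance_vec)
    return modulated / np.linalg.norm(modulated)

rng = np.random.default_rng(0)
q = rng.normal(size=768)
q /= np.linalg.norm(q)          # unit-norm query embedding
g = rng.random(768)             # toy enhancement weights in [0, 1)
q_mod = modulate(q, g)

cos = float(q @ q_mod)
print(cos)  # close to 1 for small alpha: direction is largely preserved
```

For small `alpha` the per-dimension scale factors stay near 1, so the cosine between the original and modulated embeddings remains high even though discriminative dimensions are relatively amplified.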
Problem

Research questions and friction points this paper is trying to address.

Enhances dense retrieval for fine-grained tasks
Addresses keyword and span-level localization challenges
Improves semantic matching with token-aware embedding augmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token-aware embedding augmentation for dense representations
Element-wise interaction of latent enhancement vectors
Plug-in framework preserving semantic directionality
Shaoxiong Zhan
Tsinghua University
Natural Language Processing; Large Language Models
Hai Lin
Electrical Engineering, University of Notre Dame
Cyber-Physical Systems; Hybrid Dynamical Systems; Distributed Cooperative Systems
Hongming Tan
Tsinghua University
Xiaodong Cai
Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
Hai-Tao Zheng
Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; Pengcheng Laboratory, Shenzhen, China
Xin Su
Tencent Company, Shenzhen, China
Zifei Shan
Applied Research at Tencent
machine learning; natural language processing; language models; knowledge graphs
Ruitong Liu
Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
Hong-Gee Kim
Seoul National University, South Korea