Uncertainty-driven Embedding Convolution

📅 2025-07-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text embedding models exhibit substantial performance variation across domains, and mainstream ensemble methods neglect model uncertainty, compromising downstream task robustness. To address this, the paper proposes Uncertainty-driven Embedding Convolution (UEC), an uncertainty-driven embedding fusion framework: deterministic embeddings are first converted into probabilistic ones in a post-hoc manner; an adaptive weighting mechanism, grounded in a Bayes-optimal solution under a surrogate loss, then fuses them; and an uncertainty-aware similarity function incorporates uncertainty directly into similarity scoring. This work is the first to integrate post-hoc probabilistic transformation, surrogate-loss-guided uncertainty estimation, and uncertainty-aware similarity scoring for embedding fusion. Extensive experiments on retrieval, classification, and semantic similarity tasks demonstrate significant improvements in both accuracy and robustness over state-of-the-art baselines, empirically validating that explicit uncertainty modeling plays a critical role in enhancing semantic representation quality.

📝 Abstract
Text embeddings are essential components in modern NLP pipelines. While numerous embedding models have been proposed, their performance varies across domains, and no single model consistently excels across all tasks. This variability motivates the use of ensemble techniques to combine complementary strengths. However, most existing ensemble methods operate on deterministic embeddings and fail to account for model-specific uncertainty, limiting their robustness and reliability in downstream applications. To address these limitations, we propose Uncertainty-driven Embedding Convolution (UEC). UEC first transforms deterministic embeddings into probabilistic ones in a post-hoc manner. It then computes adaptive ensemble weights based on embedding uncertainty, grounded in a Bayes-optimal solution under a surrogate loss. Additionally, UEC introduces an uncertainty-aware similarity function that directly incorporates uncertainty into similarity scoring. Extensive experiments on retrieval, classification, and semantic similarity benchmarks demonstrate that UEC consistently improves both performance and robustness by leveraging principled uncertainty modeling.
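The abstract's two core steps — post-hoc probabilistic embeddings and uncertainty-based adaptive ensemble weights — can be sketched as inverse-variance (precision) weighting, which is the Bayes-optimal fusion of independent Gaussians under squared loss. This is a minimal illustration of the general idea, not the paper's exact formulation; the function name and per-dimension variance representation are assumptions.

```python
import numpy as np

def fuse_embeddings(means, variances, eps=1e-8):
    """Fuse per-model probabilistic embeddings by inverse-variance weighting.

    means, variances: arrays of shape (n_models, dim), where each model's
    deterministic embedding is treated post hoc as a Gaussian mean with a
    per-dimension variance expressing that model's uncertainty.
    Returns the fused mean and fused variance, each of shape (dim,).
    """
    precisions = 1.0 / (variances + eps)           # weight = inverse variance
    weights = precisions / precisions.sum(axis=0)  # adaptive weights, normalized over models
    fused_mean = (weights * means).sum(axis=0)     # uncertainty-weighted average
    fused_var = 1.0 / precisions.sum(axis=0)       # precisions add under independence
    return fused_mean, fused_var
```

Under this scheme a model that is confident (low variance) on a given dimension dominates the fused embedding there, while a uniformly uncertain model is down-weighted everywhere — the behavior the abstract attributes to UEC's adaptive weights.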
Problem

Research questions and friction points this paper is trying to address.

Addresses variability in text embedding performance across domains
Improves ensemble methods by incorporating model-specific uncertainty
Enhances robustness in NLP tasks via uncertainty-aware similarity scoring
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transforms deterministic embeddings into probabilistic ones
Computes adaptive ensemble weights using uncertainty
Introduces uncertainty-aware similarity scoring function
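The third innovation, an uncertainty-aware similarity function, could take several forms; the paper's exact scoring rule is not reproduced here. One illustrative choice is to discount cosine similarity by the total uncertainty of both embeddings, so that confident matches score higher than uncertain ones (the decay form and the `alpha` scale are assumptions for illustration).

```python
import numpy as np

def uncertainty_aware_similarity(mu_a, var_a, mu_b, var_b, alpha=1.0):
    """Cosine similarity of embedding means, discounted by joint uncertainty.

    mu_*: mean vectors; var_*: per-dimension variances of each embedding.
    alpha is a hypothetical scale controlling how strongly uncertainty
    suppresses the score.
    """
    cosine = mu_a @ mu_b / (np.linalg.norm(mu_a) * np.linalg.norm(mu_b))
    penalty = np.exp(-alpha * (var_a.mean() + var_b.mean()))  # 1 when both certain
    return cosine * penalty
```

With zero variance this reduces to ordinary cosine similarity, so the uncertainty term acts purely as a confidence-based correction on top of a standard deterministic score.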