🤖 AI Summary
This paper addresses the negative transfer and performance trade-offs that arise from objective misalignment between information retrieval (IR) and semantic textual similarity (STS). The authors propose CoDiEmb, a unified embedding-learning framework that is collaborative yet task-separated. The method introduces: (1) a dynamic cross-task sampling mechanism that decouples IR's ranking-oriented preferences from STS's geometric-consistency constraints; (2) task-specialized contrastive losses coupled with a delta-guided model-fusion strategy that explicitly mitigates gradient conflicts between the two tasks; and (3) a single-stage, end-to-end training pipeline integrating multi-positive mining, cross-device hard-negative sampling, and ranking-consistency optimization. Evaluated on 15 standard benchmarks, the approach significantly alleviates multi-task interference, yields embeddings with superior geometric properties, and consistently outperforms conventional model ensembling and joint training, with improved convergence stability and strong generalization.
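The single-task batching idea behind the dynamic cross-task sampler can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `single_task_batches` and the simple shuffle-then-interleave heuristic are assumptions; the key property it demonstrates is that every batch contains examples from exactly one task, so each update carries a single task's gradient signal.

```python
import random

def single_task_batches(ir_data, sts_data, batch_size, seed=0):
    """Form batches that each contain examples from exactly ONE task,
    then interleave the batches across the training schedule.
    Because batch counts scale with dataset size, per-task update
    frequency stays roughly proportional to the available data."""
    rng = random.Random(seed)
    batches = []
    for task, data in (("ir", ir_data), ("sts", sts_data)):
        items = list(data)
        rng.shuffle(items)  # shuffle within a task
        for i in range(0, len(items), batch_size):
            batches.append((task, items[i:i + batch_size]))
    rng.shuffle(batches)  # interleave IR and STS batches across steps
    return batches
```

At training time, the loop over `batches` would dispatch each `(task, batch)` pair to the corresponding task-specialized loss, so IR and STS gradients never mix within a single step.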
📝 Abstract
Learning unified text embeddings that excel across diverse downstream tasks is a central goal in representation learning, yet negative transfer remains a persistent obstacle. This challenge is particularly pronounced when jointly training a single encoder for Information Retrieval (IR) and Semantic Textual Similarity (STS), two essential but fundamentally disparate tasks for which naive co-training typically yields steep performance trade-offs. We argue that resolving this conflict requires systematically decoupling task-specific learning signals throughout the training pipeline. To this end, we introduce CoDiEmb, a unified framework that reconciles the divergent requirements of IR and STS in a collaborative yet distinct manner. CoDiEmb integrates three key innovations for effective joint optimization: (1) Task-specialized objectives paired with a dynamic sampler that forms single-task batches and balances per-task updates, thereby preventing gradient interference. For IR, we employ a contrastive loss with multiple positives and hard negatives, augmented by cross-device sampling. For STS, we adopt order-aware objectives that directly optimize correlation and ranking consistency. (2) A delta-guided model fusion strategy that computes fine-grained merging weights for checkpoints by analyzing each parameter's deviation from its pre-trained initialization, proving more effective than traditional Model Soups. (3) An efficient, single-stage training pipeline that is simple to implement and converges stably. Extensive experiments on 15 standard IR and STS benchmarks across three base encoders validate CoDiEmb. Our results and analysis demonstrate that the framework not only mitigates cross-task trade-offs but also measurably improves the geometric properties of the embedding space.
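The delta-guided fusion in point (2) can be illustrated with a short sketch. It assumes one plausible weighting rule, per-parameter merge weights proportional to each checkpoint's absolute deviation from the pretrained initialization; the exact rule used by CoDiEmb may differ, and `delta_guided_merge` is a hypothetical helper, shown here only to contrast fine-grained weighting with a uniform Model-Soup average.

```python
import numpy as np

def delta_guided_merge(pretrained, checkpoints, eps=1e-8):
    """Merge task-specific checkpoints into one model.
    For each parameter, weight each checkpoint's delta by its
    magnitude relative to the other checkpoints' deltas, so the
    checkpoint that moved a parameter furthest from the pretrained
    initialization dominates that parameter's merged value."""
    merged = {}
    for name, base in pretrained.items():
        # Stack per-checkpoint deltas: shape (num_checkpoints, *param_shape)
        deltas = np.stack([ckpt[name] - base for ckpt in checkpoints])
        mags = np.abs(deltas)
        # Normalize magnitudes across checkpoints into per-parameter weights
        weights = mags / (mags.sum(axis=0, keepdims=True) + eps)
        merged[name] = base + (weights * deltas).sum(axis=0)
    return merged
```

With a uniform Model-Soup average, a parameter that only one task's checkpoint updated would be diluted by the untouched copies; the delta-proportional weights above instead preserve each task's strongest parameter-level edits.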