CMedTEB & CARE: Benchmarking and Enabling Efficient Chinese Medical Retrieval via Asymmetric Encoders

📅 2026-04-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses two problems: the lack of high-quality benchmarks for Chinese medical text retrieval, and the latency of large language model (LLM)-based embedding methods, which is too high for real-time applications. To this end, the authors introduce CMedTEB, the first Chinese medical embedding benchmark encompassing retrieval, re-ranking, and semantic textual similarity tasks, with annotations generated by a multi-LLM voting pipeline and validated by clinical experts. They further propose CARE, an asymmetric encoding architecture that pairs a lightweight BERT-style query encoder, used online, with a powerful LLM-based document encoder, used offline, along with a two-stage training strategy that progressively aligns the two heterogeneous representations. Experiments show that CARE surpasses state-of-the-art symmetric models on CMedTEB, achieving better retrieval performance without increasing inference latency.
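The summary does not spell out how the multi-LLM votes are combined, so the following is only a minimal sketch of one plausible rule: take the majority label, and route any pair without sufficient agreement to an expert review queue. The function name `vote_label`, the `min_agreement` threshold, and the sample labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def vote_label(llm_labels, min_agreement=1.0):
    """Combine labels from several LLM annotators (assumed majority rule).

    Returns (label, needs_expert_review). A pair is flagged for review when
    the share of annotators backing the top label is below min_agreement.
    """
    if not llm_labels:
        raise ValueError("at least one LLM label is required")
    counts = Counter(llm_labels)
    label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(llm_labels)
    return label, agreement < min_agreement


# Hypothetical relevance judgments from three LLM annotators.
candidates = {
    ("q1", "d1"): ["relevant", "relevant", "relevant"],
    ("q1", "d2"): ["relevant", "irrelevant", "relevant"],
}

accepted = {}
review_queue = []
for pair, labels in candidates.items():
    label, needs_review = vote_label(labels)
    if needs_review:
        review_queue.append(pair)  # left for clinical experts to adjudicate
    else:
        accepted[pair] = label
```

With the default threshold of 1.0, only unanimous labels are accepted automatically; lowering `min_agreement` trades expert workload against label noise.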

📝 Abstract

Effective medical text retrieval requires both high accuracy and low latency. While LLM-based embedding models possess powerful retrieval capabilities, their prohibitive latency and high computational cost limit their application in real-time scenarios. Furthermore, the lack of comprehensive and high-fidelity benchmarks hinders progress in Chinese medical text retrieval. In this work, we introduce the Chinese Medical Text Embedding Benchmark (CMedTEB), a benchmark spanning three kinds of practical embedding tasks: retrieval, reranking, and semantic textual similarity (STS). Distinct from purely automated datasets, CMedTEB is curated via a rigorous multi-LLM voting pipeline validated by clinical experts, ensuring gold-standard label quality while effectively mitigating annotation noise. On this foundation, we propose the Chinese Medical Asymmetric REtriever (CARE), an asymmetric architecture that pairs a lightweight BERT-style encoder for online query encoding with a powerful LLM-based encoder for offline document encoding. However, optimizing such an asymmetric retriever with two structurally different encoders presents distinctive challenges. To address this, we introduce a novel two-stage training strategy that progressively bridges the query and document representations. Extensive experiments demonstrate that CARE surpasses state-of-the-art symmetric models on CMedTEB, achieving superior retrieval performance without increasing inference latency.
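To make the asymmetric setup concrete, here is a toy sketch of the retrieval flow only: documents are embedded once, offline, by a more expensive encoder, and each query is embedded online by a cheaper one into the same vector space. The two encoders below are hashed character n-gram stand-ins invented for illustration; they are not CARE's BERT-style and LLM-based encoders, and the sketch does not cover the two-stage training strategy. `DIM`, `encode_document`, `encode_query`, `build_index`, and `search` are all assumed names.

```python
import math
import zlib

DIM = 64  # shared embedding dimension (assumed for this sketch)


def _bucket(token):
    # Deterministic hash so both encoders map tokens into the same space.
    return zlib.crc32(token.encode("utf-8")) % DIM


def _normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def encode_document(text):
    """Stand-in for the heavy offline encoder: character unigrams and bigrams."""
    vec = [0.0] * DIM
    for ch in text:
        vec[_bucket(ch)] += 1.0
    for a, b in zip(text, text[1:]):
        vec[_bucket(a + b)] += 1.0
    return _normalize(vec)


def encode_query(text):
    """Stand-in for the lightweight online encoder: character unigrams only."""
    vec = [0.0] * DIM
    for ch in text:
        vec[_bucket(ch)] += 1.0
    return _normalize(vec)


def build_index(docs):
    """Offline step: embed every document once and store the vectors."""
    return [(doc_id, encode_document(text)) for doc_id, text in docs.items()]


def search(query, index, k=1):
    """Online step: embed the query cheaply and rank documents by dot product."""
    q = encode_query(query)
    scored = [(sum(a * b for a, b in zip(q, v)), doc_id) for doc_id, v in index]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]


docs = {"d1": "高血压的常用药物", "d2": "骨折术后康复"}
index = build_index(docs)            # paid once, offline
top = search("高血压用药", index, k=1)  # only the cheap encoder runs per query
```

The point of the design is that per-query cost depends only on the query encoder, so the document encoder can be made arbitrarily large without affecting online latency, provided the two representations are aligned.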

Problem

Research questions and friction points this paper is trying to address.

Chinese medical retrieval

benchmark

low latency

high-fidelity annotation

real-time retrieval

Innovation

Methods, ideas, or system contributions that make the work stand out.

asymmetric encoders

Chinese medical retrieval

embedding benchmark