CMedTEB & CARE: Benchmarking and Enabling Efficient Chinese Medical Retrieval via Asymmetric Encoders

📅 2026-04-12
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses two obstacles in Chinese medical text retrieval: the lack of comprehensive, high-fidelity benchmarks, and the latency and compute cost that make large language model (LLM)-based embedding models impractical for real-time use. The authors introduce CMedTEB, a Chinese medical text embedding benchmark covering retrieval, reranking, and semantic textual similarity (STS), with labels produced by a multi-LLM voting pipeline and validated by clinical experts. They further propose CARE, an asymmetric architecture that pairs a lightweight BERT-style encoder for online query encoding with an LLM-based encoder for offline document encoding, trained with a two-stage strategy that progressively aligns the two structurally different representations. In the reported experiments, CARE surpasses state-of-the-art symmetric models on CMedTEB without increasing inference latency.

📝 Abstract
Effective medical text retrieval requires both high accuracy and low latency. While LLM-based embedding models possess powerful retrieval capabilities, their prohibitive latency and high computational cost limit their application in real-time scenarios. Furthermore, the lack of comprehensive and high-fidelity benchmarks hinders progress in Chinese medical text retrieval. In this work, we introduce the Chinese Medical Text Embedding Benchmark (CMedTEB), a benchmark spanning three kinds of practical embedding tasks: retrieval, reranking, and semantic textual similarity (STS). Distinct from purely automated datasets, CMedTEB is curated via a rigorous multi-LLM voting pipeline validated by clinical experts, ensuring gold-standard label quality while effectively mitigating annotation noise. On this foundation, we propose the Chinese Medical Asymmetric REtriever (CARE), an asymmetric architecture that pairs a lightweight BERT-style encoder for online query encoding with a powerful LLM-based encoder for offline document encoding. However, optimizing such an asymmetric retriever with two structurally different encoders presents distinctive challenges. To address this, we introduce a novel two-stage training strategy that progressively bridges the query and document representations. Extensive experiments demonstrate that CARE surpasses state-of-the-art symmetric models on CMedTEB, achieving superior retrieval performance without increasing inference latency.
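The asymmetric split described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the CARE implementation: `encode_document` and `encode_query` are hypothetical stand-ins (simple bag-of-words vectors) for the LLM-based document encoder and the lightweight BERT-style query encoder, and the toy English corpus is invented for illustration. The only point it shows is the data flow: documents are embedded once offline, while each online request only pays for the cheap query encoder plus a dot product in the shared vector space.

```python
import math

def build_vocab(docs):
    """Assign each distinct token an index; this defines the shared vector space."""
    vocab = {}
    for doc in docs:
        for tok in doc.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

def encode_document(text, vocab):
    """Hypothetical stand-in for the heavy, offline LLM-based document encoder."""
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] += 1.0  # term counts
    return l2_normalize(vec)

def encode_query(text, vocab):
    """Hypothetical stand-in for the lightweight, online query encoder."""
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] = 1.0  # binary presence only
    return l2_normalize(vec)

def search(query, doc_index, vocab, top_k=1):
    """Online path: encode the query, then score it against precomputed vectors."""
    q = encode_query(query, vocab)
    scored = [(sum(a * b for a, b in zip(q, d)), i) for i, d in enumerate(doc_index)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [i for _, i in scored[:top_k]]

# Toy corpus (invented for illustration only).
DOCS = [
    "aspirin relieves mild headache and fever",
    "insulin therapy controls blood glucose in diabetes",
    "amoxicillin treats bacterial throat infection",
]

# Offline step: run the expensive document encoder once and store the vectors.
VOCAB = build_vocab(DOCS)
DOC_INDEX = [encode_document(d, VOCAB) for d in DOCS]

if __name__ == "__main__":
    # Online step: only the cheap query encoder runs per request.
    print(search("drug for headache", DOC_INDEX, VOCAB))
    print(search("blood glucose control", DOC_INDEX, VOCAB))
```

Because the two encoders differ, their outputs are only comparable if both land in the same space; in this sketch that is guaranteed by the shared vocabulary, whereas the abstract attributes the alignment to the two-stage training strategy.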
Problem

Research questions and friction points this paper is trying to address.

Chinese medical retrieval
benchmark
low latency
high-fidelity annotation
real-time retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

asymmetric encoders
Chinese medical retrieval
embedding benchmark
two-stage training
low-latency retrieval
👥 Authors
Angqing Jiang
University of Science and Technology of China, State Key Laboratory of Cognitive Intelligence, iFlytek Research
Jianlyu Chen
University of Science and Technology of China
Natural Language Processing, Information Retrieval
Zhe Fang
State Key Laboratory of Cognitive Intelligence, iFlytek Research
Yongcan Wang
State Key Laboratory of Cognitive Intelligence, iFlytek Research
Xinpeng Li
The University of Texas at Dallas
Artificial intelligence and social interaction understanding
Keyu Ding
HeFei Institute of Technology
Defu Lian
University of Science and Technology of China, State Key Laboratory of Cognitive Intelligence