A Retrieval-Based Approach to Medical Procedure Matching in Romanian

📅 2025-03-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address classification errors, claim-processing delays, and inefficient manual mapping caused by inconsistencies between institutional procedure names and insurance-standard terminology in the Romanian healthcare system, this paper proposes the first retrieval-based terminology matching framework for Romanian-language clinical text. The authors systematically evaluate and ensemble Romanian-specific (RoBERTa-base-ro), multilingual (mBERT), and domain-adapted biomedical (BioBERT-ro) language models to construct a semantic similarity pipeline based on sentence embeddings. Evaluated on real-world Romanian medical data, the approach achieves 92.4% Top-1 matching accuracy, substantially outperforming edit-distance and generic word-embedding baselines. The work addresses a gap in medical terminology alignment for low-resource languages and provides empirical evidence that domain-adapted embeddings are essential for Romanian clinical NLP.

📝 Abstract

Accurately mapping medical procedure names from healthcare providers to the standardized terminology used by insurance companies is a crucial yet complex task. Inconsistencies in naming conventions lead to misclassified procedures, causing administrative inefficiencies and insurance claim problems in private healthcare settings. Many companies still rely on staff to perform this mapping manually, although the task is a clear candidate for automation. This paper proposes a retrieval-based architecture that leverages sentence embeddings for medical name matching in the Romanian healthcare system. The challenge is significantly harder in underrepresented languages such as Romanian, where existing pretrained language models lack domain-specific adaptation to medical text. We evaluate multiple embedding models, including Romanian, multilingual, and medical-domain-specific representations, to identify the most effective solution for this task. Our findings contribute to the broader field of medical NLP for low-resource languages such as Romanian.
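
The retrieval idea described in the abstract can be illustrated with a minimal sketch: embed every standardized procedure name once, embed the provider's name at query time, and return the nearest standardized names by cosine similarity. This is not the authors' implementation. The class `ProcedureRetriever`, the function `char_ngram_embed`, and the example procedure names are all hypothetical, and the character n-gram embedder is only a toy stand-in for a real sentence-embedding model so that the example runs with the standard library alone.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]  # sparse vector: feature -> weight


def char_ngram_embed(text: str, n: int = 3) -> Vector:
    """Toy stand-in for a sentence-embedding model (assumption, not the paper's model).

    Returns L2-normalised character n-gram counts.
    """
    padded = f" {text.lower().strip()} "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {gram: c / norm for gram, c in counts.items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two L2-normalised sparse vectors (a plain dot product)."""
    return sum(weight * b.get(feature, 0.0) for feature, weight in a.items())


class ProcedureRetriever:
    """Index standardized names once, then rank them against a provider's name."""

    def __init__(self, standard_names: List[str],
                 embed: Callable[[str], Vector] = char_ngram_embed) -> None:
        self.embed = embed
        self.names = list(standard_names)
        self.index = [embed(name) for name in self.names]

    def match(self, query: str, top_k: int = 3) -> List[Tuple[str, float]]:
        """Return the top_k standardized names with their similarity scores."""
        q = self.embed(query)
        scored = [(name, cosine(q, vec)) for name, vec in zip(self.names, self.index)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Hypothetical insurer catalog and provider-side name, for illustration only.
catalog = ["Ecografie abdominala", "Radiografie toracica", "Consultatie cardiologie"]
retriever = ProcedureRetriever(catalog)
for name, score in retriever.match("ecografie abdomen", top_k=2):
    print(f"{score:.2f}  {name}")
```

In a real system the `embed` callable would wrap one of the Romanian, multilingual, or medical-domain models the paper compares, and the linear scan over the index would typically be replaced by a vector index once the catalog grows large.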

Problem

Research questions and friction points this paper is trying to address.

Mapping medical procedure names to standardized terminology accurately

Reducing administrative inefficiencies in Romanian healthcare insurance claims

Improving automation for medical name matching in low-resource languages

Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-based architecture for medical matching

Leverages sentence embeddings for name mapping

Evaluates multilingual and medical-domain embeddings
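
Comparing embedding models, as the last point describes, amounts to running the same retrieval procedure with each candidate embedder and scoring it on labelled name pairs. The sketch below shows one way to compute Top-k accuracy for that comparison. It is an illustration under stated assumptions, not the paper's evaluation code: `top_k_accuracy`, the two toy embedders, the catalog, and the labelled pairs are all hypothetical, and the toy embedders merely stand in for the Romanian, multilingual, and medical-domain models.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def _normalise(counts: Counter) -> Vector:
    """Scale a count vector to unit L2 length."""
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {k: v / norm for k, v in counts.items()}


def word_embed(text: str) -> Vector:
    """Toy embedder A (assumption): bag of whole words."""
    return _normalise(Counter(text.lower().split()))


def trigram_embed(text: str) -> Vector:
    """Toy embedder B (assumption): bag of character trigrams."""
    padded = f" {text.lower().strip()} "
    return _normalise(Counter(padded[i:i + 3] for i in range(len(padded) - 2)))


def cosine(a: Vector, b: Vector) -> float:
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(k, 0.0) for k, w in a.items())


def top_k_accuracy(embed: Callable[[str], Vector], catalog: List[str],
                   labelled: List[Tuple[str, str]], k: int = 1) -> float:
    """Fraction of queries whose gold name appears among the k nearest catalog names.

    Ties are broken by catalog order because Python's sort is stable.
    """
    index = [(name, embed(name)) for name in catalog]
    hits = 0
    for query, gold in labelled:
        q = embed(query)
        ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
        if gold in [name for name, _ in ranked[:k]]:
            hits += 1
    return hits / len(labelled)


# Hypothetical catalog and labelled (provider name, standardized name) pairs.
catalog = ["Radiografie toracica", "Consultatie cardiologie", "Ecografie abdominala"]
labelled = [
    ("eco abdomen", "Ecografie abdominala"),
    ("radiografie torace", "Radiografie toracica"),
    ("consult cardio", "Consultatie cardiologie"),
]

for label, embedder in [("word", word_embed), ("trigram", trigram_embed)]:
    print(label, round(top_k_accuracy(embedder, catalog, labelled, k=1), 2))
```

Keeping the catalog, the labelled pairs, and the ranking rule fixed while swapping only the `embed` callable means any difference in the score is attributable to the embedding model.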
