JEBS: A Fine-grained Biomedical Lexical Simplification Task

📅 2025-06-15
🤖 AI Summary
The proliferation of online biomedical literature has exacerbated public comprehension barriers, primarily due to complex domain-specific terminology. Existing simplification corpora lack fine-grained annotations, hindering targeted modeling and rigorous evaluation. To address this, we propose JEBS, a novel term-level biomedical simplification task and the first fine-grained framework that decomposes simplification into three sequential stages: (1) identification of complex biomedical terms, (2) classification of replacement strategies along semantic, syntactic, or explanatory dimensions, and (3) generation of simplified text. We construct and publicly release the JEBS dataset, comprising 21,595 annotated term replacements for 10,314 unique terms across 400 abstracts. Additionally, we design a multi-stage model integrating rule-based matching with BERT and T5. The experiments establish baselines for all three stages and support interpretable, evaluable research on biomedical terminology simplification.

📝 Abstract
Online medical literature has made health information more available than ever; however, complex medical jargon prevents the general public from understanding it. Though parallel and comparable corpora for Biomedical Text Simplification have been introduced, these conflate the many syntactic and lexical operations involved in simplification. To enable more targeted development and evaluation, we present a fine-grained lexical simplification task and dataset, Jargon Explanations for Biomedical Simplification (JEBS, https://github.com/bill-from-ri/JEBS-data). The JEBS task involves identifying complex terms, classifying how to replace them, and generating replacement text. The JEBS dataset contains 21,595 replacements for 10,314 terms across 400 biomedical abstracts and their manually simplified versions. Additionally, we provide baseline results for a variety of rule-based and transformer-based systems for the three sub-tasks. The JEBS task, data, and baseline results pave the way for development and rigorous evaluation of systems for replacing or explaining complex biomedical terms.
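
The three sub-tasks named in the abstract (identify, classify, generate) can be pictured as a small pipeline. The following is only a minimal rule-based sketch and not the authors' baseline: the glossary entries, the `substitute` and `explain` labels, and all function names are hypothetical and are not taken from the JEBS dataset or its annotation scheme.

```python
from dataclasses import dataclass

# Hypothetical toy glossary: term -> (replacement strategy, replacement text).
# Neither the entries nor the strategy labels come from the JEBS data.
GLOSSARY = {
    "myocardial infarction": ("substitute", "a heart attack"),
    "hypertension": ("explain", "hypertension (high blood pressure)"),
}


@dataclass
class Replacement:
    start: int   # character offset where the term begins
    end: int     # character offset just past the term
    term: str    # the complex term as it appears in the text
    action: str  # how the term is replaced (hypothetical label)
    text: str    # the simplified replacement text


def identify(sentence: str) -> list[tuple[int, int]]:
    """Sub-task 1: locate glossary terms, longest first, without overlaps."""
    lowered = sentence.lower()
    spans: list[tuple[int, int]] = []
    for term in sorted(GLOSSARY, key=len, reverse=True):
        pos = lowered.find(term)
        while pos != -1:
            end = pos + len(term)
            if all(end <= s or pos >= e for s, e in spans):
                spans.append((pos, end))
            pos = lowered.find(term, end)
    return sorted(spans)


def classify(term: str) -> str:
    """Sub-task 2: decide how the term should be replaced."""
    return GLOSSARY[term.lower()][0]


def generate(term: str) -> str:
    """Sub-task 3: produce the replacement text."""
    return GLOSSARY[term.lower()][1]


def simplify(sentence: str) -> tuple[str, list[Replacement]]:
    """Run the three sub-tasks in order and splice replacements into the text."""
    pieces, replacements, cursor = [], [], 0
    for start, end in identify(sentence):
        term = sentence[start:end]
        rep = Replacement(start, end, term, classify(term), generate(term))
        replacements.append(rep)
        pieces.append(sentence[cursor:start])
        pieces.append(rep.text)
        cursor = end
    pieces.append(sentence[cursor:])
    return "".join(pieces), replacements


if __name__ == "__main__":
    simplified, reps = simplify(
        "Untreated hypertension can lead to myocardial infarction."
    )
    print(simplified)
    for rep in reps:
        print(rep)
```

Keeping the stages as separate functions mirrors the task decomposition: each stage can be swapped for a learned model and scored on its own, which is the kind of targeted evaluation the abstract argues for.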
Problem

Research questions and friction points this paper is trying to address.

Simplify complex biomedical jargon for public understanding
Develop fine-grained lexical simplification task and dataset
Provide baseline systems for biomedical term replacement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-grained lexical simplification task
Dataset with 21,595 biomedical term replacements
Rule-based and transformer-based baseline systems