THM@SimpleText 2025 -- Task 1.1: Revisiting Text Simplification based on Complex Terms for Non-Experts

📅 2025-07-06

🤖 AI Summary

This study addresses the low comprehensibility of context-poor, very short scientific sentences (for example, from abstracts and titles) for non-expert readers. It proposes a term-level simplification method that first identifies domain-specific complex terms in the input sentence, then uses lightweight Gemini and OpenAI models for controlled, meaning-preserving rephrasing, avoiding global rewriting so that information is not distorted. The approach combines rule-based terminology detection with the semantic fidelity of large language model generation. Evaluated on CLEF SimpleText 2025 Task 1.1, it significantly improves readability and accuracy for low-context scientific sentences. Experiments demonstrate superiority over baselines across three dimensions: plausibility of term replacement, preservation of original meaning, and comprehension by non-experts. The method thus makes scientific knowledge more accessible to the public without compromising technical integrity.

📝 Abstract

Scientific text is complex because, by definition, it contains technical terms. Simplifying such text for non-domain experts makes innovation and information more accessible. Politicians could understand new findings on topics on which they intend to pass a law, and family members of seriously ill patients could read about clinical trials. The SimpleText CLEF Lab focuses on exactly this problem of simplifying scientific text. Task 1.1 of the 2025 edition specifically handles the simplification of complex sentences, that is, very short texts with little context. To tackle this task, we investigate the identification of complex terms in sentences, which are then rephrased for non-expert readers using small Gemini and OpenAI large language models.

Problem

Research questions and friction points this paper is trying to address.

Simplifying scientific text for non-domain experts

Identifying complex terms in short scientific sentences

Rephrasing sentences using LLMs for non-expert readers

Innovation

Methods, ideas, or system contributions that make the work stand out.

Identify complex terms in sentences

Rephrase sentences using small Gemini models

Rephrase sentences using small OpenAI large language models
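
The two steps listed above (find complex terms, then ask a model to rephrase only those terms) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the COMMON_WORDS list, the length threshold, the function names find_complex_terms and build_prompt, and the prompt wording are all hypothetical. The call to a Gemini or OpenAI model is deliberately left out; the sketch only builds the prompt text that such a call might receive.

```python
import re

# Hypothetical stand-in for a real frequency list of everyday English words.
COMMON_WORDS = {
    "the", "a", "an", "of", "after", "before", "and", "or", "with",
    "patient", "received", "failed", "because", "through", "between",
}


def find_complex_terms(sentence, common_words=COMMON_WORDS, min_length=7):
    """Return candidate complex terms in order of first appearance.

    Assumed heuristic: a word is a candidate if it is at least
    `min_length` characters long and not in the everyday word list.
    """
    terms = []
    for token in re.findall(r"[A-Za-z][A-Za-z-]*", sentence):
        word = token.lower()
        if len(word) >= min_length and word not in common_words and word not in terms:
            terms.append(word)
    return terms


def build_prompt(sentence, terms):
    """Build a rephrasing prompt that restricts changes to the given terms."""
    if not terms:
        # Nothing flagged: ask for a light rewrite that keeps the meaning.
        return (
            "Rewrite this sentence for a non-expert reader, "
            f"keeping its meaning:\n{sentence}"
        )
    term_list = ", ".join(terms)
    return (
        "Rewrite the sentence for a non-expert reader. "
        f"Replace or explain only these terms: {term_list}. "
        "Keep the rest of the sentence and its meaning unchanged.\n"
        f"Sentence: {sentence}"
    )


if __name__ == "__main__":
    example = "The patient received immunotherapy after chemotherapy failed."
    flagged = find_complex_terms(example)
    print(flagged)
    print(build_prompt(example, flagged))
```

Restricting the prompt to the flagged terms mirrors the idea of term-level rather than whole-sentence rewriting. A real system would likely replace the word-list heuristic with a stronger term detector.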

🔎 Similar Papers

An In-depth Evaluation of Large Language Models in Sentence Simplification with Error-based Human Assessment