🤖 AI Summary
This study addresses the low comprehensibility of context-poor, ultra-short scientific sentences, such as those drawn from abstracts and titles, for non-expert readers. We propose a term-level simplification method that first identifies domain-specific complex terms in the input sentence and then uses lightweight Gemini and OpenAI models for controlled, semantics-preserving rephrasing. Global rewriting is avoided to prevent information distortion. The approach combines rule-based terminology detection with the semantic fidelity of large language models in generation. Evaluated on CLEF SimpleText 2025 Task 1.1, it significantly improves readability and accuracy for low-context scientific sentences. Experiments demonstrate superiority over baselines across three dimensions: plausibility of term replacement, preservation of original meaning, and comprehension by non-experts. The method thus enhances public accessibility of scientific knowledge without compromising technical integrity.
📝 Abstract
Scientific text is complex because it contains technical terms by definition. Simplifying such text for readers outside the domain makes innovation and information more accessible. Politicians could understand new findings on topics on which they intend to pass a law, and family members of seriously ill patients could read about clinical trials. The SimpleText CLEF Lab focuses on exactly this problem of simplifying scientific text. Task 1.1 of the 2025 edition specifically handles the simplification of complex sentences, that is, very short texts with little context. To tackle this task, we investigate identifying the complex terms in a sentence and then rephrasing the sentence for non-expert readers using small Gemini and OpenAI large language models.
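The two-step idea in the abstract (find the complex terms, then ask a language model to rephrase the sentence around them) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the length-based term detector, the `COMMON_WORDS` list, the prompt wording, and the `stub_llm` stand-in are all hypothetical, and a real system would pass a function that calls a Gemini or OpenAI model in place of the stub.

```python
import re
from typing import Callable, List

# Hypothetical mini-vocabulary of everyday words (an assumption for this sketch;
# a real system would rely on a much larger frequency list or a trained detector).
COMMON_WORDS = {"different", "important", "information", "understand", "together"}


def find_complex_terms(sentence: str, min_length: int = 8) -> List[str]:
    """Heuristic detector: flag long words that are not in COMMON_WORDS."""
    words = re.findall(r"[A-Za-z][A-Za-z-]*", sentence)
    seen, terms = set(), []
    for word in words:
        key = word.lower()
        if len(key) >= min_length and key not in COMMON_WORDS and key not in seen:
            seen.add(key)
            terms.append(word)
    return terms


def build_prompt(sentence: str, terms: List[str]) -> str:
    """Ask for a term-level rewrite instead of a free rewrite of the sentence."""
    return (
        "Rewrite the sentence for a non-expert reader. "
        "Only replace or explain the listed terms and keep everything else unchanged.\n"
        f"Terms: {', '.join(terms)}\n"
        f"Sentence: {sentence}\n"
    )


def simplify(sentence: str, llm: Callable[[str], str]) -> str:
    """Detect complex terms, then delegate the rephrasing to the given model."""
    terms = find_complex_terms(sentence)
    if not terms:
        return sentence  # nothing flagged, so leave the sentence untouched
    return llm(build_prompt(sentence, terms)).strip()


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to make the sketch runnable."""
    sentence = prompt.split("Sentence: ", 1)[1].splitlines()[0]
    return sentence.replace("hepatocellular carcinoma", "liver cancer")


if __name__ == "__main__":
    print(simplify("We study hepatocellular carcinoma in mice.", stub_llm))
```

Passing the model as a plain callable keeps the detection step independent of any particular provider, so the same detected terms can be sent to different models and the outputs compared.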