New Evaluation Paradigm for Lexical Simplification

📅 2025-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing lexical simplification (LS) evaluation methods focus solely on individual difficult word substitution, failing to assess sentence-level simplification quality—particularly in contextual modeling and stepwise simplification. This work proposes the first end-to-end, sentence-level LS evaluation paradigm tailored for large language models (LLMs). We design a human-in-the-loop, full-coverage annotation protocol and develop a multi-LLM collaborative framework that explicitly simulates the three-stage LS process—identification, substitution, and reconstruction—thereby overcoming the limitations of single-prompt simplification. Evaluated on a newly constructed benchmark, our method significantly outperforms all baseline approaches. To the best of our knowledge, this is the first systematic, reproducible assessment of LLMs’ holistic sentence simplification capability. The results empirically validate the effectiveness and advancement of the proposed end-to-end evaluation paradigm.

📝 Abstract
Lexical Simplification (LS) methods use a three-step pipeline: complex word identification, substitute generation, and substitute ranking, each with separate evaluation datasets. We found that large language models (LLMs) can simplify sentences directly with a single prompt, bypassing the traditional pipeline. However, existing LS datasets are not suitable for evaluating these LLM-generated simplified sentences, as they focus on providing substitutes for single complex words without identifying all complex words in a sentence. To address this gap, we propose a new annotation method for constructing an all-in-one LS dataset through human-machine collaboration. Automated methods generate a pool of potential substitutes, which human annotators then assess, suggesting additional alternatives as needed. Additionally, we explore LLM-based methods with single prompts, in-context learning, and chain-of-thought techniques. We introduce a multi-LLM collaboration approach that simulates each step of the LS task. Experimental results demonstrate that the multi-LLM approach significantly outperforms existing baselines.
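The three-step pipeline the abstract describes — identify complex words, generate substitutes, then rank and rewrite — can be sketched as separate LLM calls per step. This is a minimal illustration, not the authors' implementation: the `ask` callable and its stubbed offline behavior are assumptions standing in for a real chat-completion endpoint.

```python
from typing import Callable, List

def identify_complex_words(sentence: str, ask: Callable[[str], str]) -> List[str]:
    """Step 1: one LLM call flags every complex word in the sentence."""
    reply = ask(f"List the complex words, comma-separated, in: '{sentence}'")
    return [w.strip() for w in reply.split(",") if w.strip()]

def generate_substitutes(sentence: str, word: str, ask: Callable[[str], str]) -> List[str]:
    """Step 2: a second call proposes simpler in-context substitutes."""
    reply = ask(f"Suggest simpler words for '{word}' in: '{sentence}'")
    return [w.strip() for w in reply.split(",") if w.strip()]

def simplify(sentence: str, ask: Callable[[str], str]) -> str:
    """Step 3: take the top-ranked substitute per complex word and rewrite."""
    out = sentence
    for word in identify_complex_words(sentence, ask):
        candidates = generate_substitutes(out, word, ask)
        if candidates:
            # Assume the LLM returns candidates best-first, so take the head.
            out = out.replace(word, candidates[0])
    return out

# Hypothetical offline stub so the sketch runs without an API key;
# a real system would route `ask` to an LLM endpoint instead.
def fake_llm(prompt: str) -> str:
    return "ubiquitous" if prompt.startswith("List") else "common"

print(simplify("Smartphones are ubiquitous today.", fake_llm))
```

Keeping each step as its own call (rather than one single-prompt rewrite) is the point of the multi-LLM collaboration: every stage can be evaluated, swapped, or assigned to a different model independently.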
Problem

Research questions and friction points this paper is trying to address.

Vocabulary Simplification
Sentence-level Assessment
Contextual Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Collaborative Human-Machine Approach
Sentence-Level Lexical Simplification
Large Language Model Integration
Jipeng Qiang
Yangzhou University
Data Mining · NLP
Minjiang Huang
School of Information and Engineering, Yangzhou University, Jiangsu, China
Yi Zhu
School of Information and Engineering, Yangzhou University, Jiangsu, China
Yunhao Yuan
School of Information and Engineering, Yangzhou University, Jiangsu, China
Chaowei Zhang
Department of Computer Science at Yangzhou University
Natural Language Processing · Data Mining · Parallel Computing
Xiaoye Ouyang
China Academy of Electronic and Information Technology, Beijing, China