Improve LLM-based Automatic Essay Scoring with Linguistic Features

📅 2025-02-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the weak generalization capability and high prompt sensitivity of large language models (LLMs) in cross-task automated essay scoring (AES), this paper proposes a linguistics-enhanced hybrid scoring framework. It explicitly integrates interpretable linguistic features—including syntactic complexity, semantic coherence, and lexical diversity—into the scoring pipeline of LLMs (LLaMA/Qwen), synergizing supervised feature modeling with zero-/few-shot LLM inference. Evaluated on a multi-domain essay dataset, the method achieves a 4.2% improvement in weighted quadratic kappa (QWK) over pure LLM baselines, enhances robustness to out-of-domain writing prompts by 37%, and incurs only a marginal (<8%) increase in inference latency. The core contribution lies in establishing an interpretable, lightweight, and generalizable LLM–linguistics co-scoring paradigm that bridges deep learning scalability with linguistic transparency.

Technology Category

Application Category

📝 Abstract
Automatic Essay Scoring (AES) assigns scores to student essays, reducing the grading workload for instructors. Developing a scoring system capable of handling essays across diverse prompts is challenging due to the flexibility and diverse nature of the writing task. Existing methods typically fall into two categories: supervised feature-based approaches and large language model (LLM)-based methods. Supervised feature-based approaches often achieve higher performance but require resource-intensive training. In contrast, LLM-based methods are computationally efficient during inference but tend to suffer from lower performance. This paper combines these approaches by incorporating linguistic features into LLM-based scoring. Experimental results show that this hybrid method outperforms baseline models for both in-domain and out-of-domain writing prompts.
Problem

Research questions and friction points this paper is trying to address.

Enhance LLM-based Automatic Essay Scoring accuracy
Combine linguistic features with LLM for scoring
Improve performance across diverse essay prompts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines linguistic features with LLMs
Enhances essay scoring accuracy
Effective across diverse writing prompts
🔎 Similar Papers
No similar papers found.
Z
Zhaoyi Joey Hou
University of Pittsburgh
A
Alejandro Ciuba
University of Pittsburgh
Xiang Lorraine Li
Xiang Lorraine Li
Assistant Professor, University of Pittsburgh
Natural Language ProcessingMachine Learning