Just Read the Question: Enabling Generalization to New Assessment Items with Text Awareness

📅 2025-07-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing machine learning models for educational assessment rely heavily on historical response data and struggle to incorporate new items. This paper proposes Text-LENS, which extends the LENS partial variational autoencoder with item text embeddings so that item content, and not only past responses, informs predictions. Text-LENS can therefore predict student performance on items that have no prior response records. In experiments on Eedi, a publicly available dataset that includes item content, and LLM-Sim, a new dataset of LLM-generated test items, Text-LENS matches LENS on seen items and improves on it in a variety of unseen-item conditions. It both learns student proficiency from new items and predicts student performance on them. This could support adding new items to an item bank, and adaptive assessment, with less reliance on extensive historical interaction data.

📝 Abstract

Machine learning has been proposed as a way to improve educational assessment by making fine-grained predictions about student performance and learning relationships between items. One challenge with many machine learning approaches is incorporating new items, as these approaches rely heavily on historical data. We develop Text-LENS by extending the LENS partial variational auto-encoder for educational assessment to leverage item text embeddings, and explore the impact on predictive performance and generalization to previously unseen items. We examine performance on two datasets: Eedi, a publicly available dataset that includes item content, and LLM-Sim, a novel dataset with test items produced by an LLM. We find that Text-LENS matches LENS' performance on seen items and improves upon it in a variety of conditions involving unseen items; it effectively learns student proficiency from and makes predictions about student performance on new items.

Problem

Research questions and friction points this paper is trying to address.

Improving generalization to new assessment items using text awareness

Leveraging item text embeddings for better predictive performance

Enhancing student proficiency prediction on unseen items with Text-LENS

Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends LENS with text embeddings

Improves prediction on unseen items

Uses LLM-generated test items
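
To illustrate why text embeddings help with unseen items, here is a minimal sketch. It is not the Text-LENS architecture: instead of a partial variational autoencoder it uses a Rasch-style one-parameter logistic model, and it assumes a hypothetical linear head that maps an item's text embedding to a scalar difficulty. The function names, weights, and embedding values are all made up for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def item_difficulty(embedding, weights, bias=0.0):
    """Map an item's text embedding to a scalar difficulty.

    Hypothetical linear head; in practice the weights would be learned
    from responses to items that have already been administered.
    """
    return sum(w * e for w, e in zip(weights, embedding)) + bias


def predict_correct(ability, embedding, weights, bias=0.0):
    """Rasch-style probability that a student answers the item correctly."""
    return sigmoid(ability - item_difficulty(embedding, weights, bias))


# Toy values (assumptions): weights "learned" on seen items, and the
# embedding of a new item that has no response history.
weights = [0.8, -0.5, 0.3]
new_item_embedding = [1.0, 0.2, -0.4]

p = predict_correct(ability=0.5, embedding=new_item_embedding, weights=weights)
print(f"Predicted probability of a correct answer: {p:.3f}")
```

Because the item's parameters come from its text rather than from a per-item lookup table, the same function applies to an item nobody has answered yet, which is the property the highlights above describe.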

🔎 Similar Papers

Enhancing textual textbook question answering with large language models and retrieval augmented generation