Predicting Contextual Informativeness for Vocabulary Learning using Deep Learning

📅 2026-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of automatically selecting highly informative contextual examples for first-language vocabulary instruction for high school students. The authors propose a hybrid approach that integrates deep learning with handcrafted features to predict contextual informativeness. Central to their method is a novel metric, the Retention Competency Curve, which is combined with instruction-aware embeddings derived from MPNet and Qwen3 and manually engineered contextual features, all modeled through a nonlinear regression head. Experimental results demonstrate that the best model dramatically improves corpus quality: at the cost of discarding 70% of the good contexts, it raises the ratio of informative to non-informative contexts to 440:1, substantially enhancing the quality of instructional materials.

📝 Abstract
We describe a modern deep learning system that automatically identifies informative contextual examples ("contexts") for first-language vocabulary instruction for high school students. Our paper compares three modeling approaches: (i) an unsupervised similarity-based strategy using MPNet's uniformly contextualized embeddings, (ii) a supervised framework built on instruction-aware, fine-tuned Qwen3 embeddings with a nonlinear regression head, and (iii) model (ii) augmented with handcrafted context features. We introduce a novel metric, the Retention Competency Curve, to visualize the trade-off between the proportion of good contexts discarded and the good-to-bad context ratio, providing a compact, unified lens on model performance. Model (iii) delivers the largest gains, reaching a good-to-bad ratio of 440 while discarding only 70% of the good contexts. In summary, we demonstrate that a modern embedding model paired with a neural regression head, when guided by human supervision, yields a low-cost, large supply of near-perfect contexts for teaching vocabulary across a variety of target words.
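The Retention Competency Curve described in the abstract can be sketched as a threshold sweep over model scores: for each cutoff, record what fraction of good contexts is discarded and the good-to-bad ratio among the contexts that remain. The function below is an illustrative reconstruction of that idea, not the authors' code; the function name and data layout are assumptions.

```python
from typing import List, Tuple

def retention_competency_curve(
    scores: List[float], labels: List[bool]
) -> List[Tuple[float, float, float]]:
    """Sketch of a Retention Competency Curve (hypothetical reconstruction).

    For each candidate score threshold, report a point
    (threshold, fraction of good contexts discarded,
     good-to-bad ratio among retained contexts).

    labels[i] is True when context i is a "good" (informative) context.
    """
    total_good = sum(labels)
    points = []
    for t in sorted(set(scores)):  # sweep thresholds over observed scores
        kept = [(s, lab) for s, lab in zip(scores, labels) if s >= t]
        kept_good = sum(lab for _, lab in kept)
        kept_bad = len(kept) - kept_good
        discarded_good = (total_good - kept_good) / total_good
        # A stricter threshold discards more good contexts but
        # purifies the retained set; infinite ratio means no bad kept.
        ratio = kept_good / kept_bad if kept_bad else float("inf")
        points.append((t, discarded_good, ratio))
    return points
```

Reading the curve: moving along increasing thresholds trades away good contexts (second coordinate rises) in exchange for a purer retained corpus (third coordinate rises), which is exactly the trade-off the paper's 440-at-70%-discarded operating point summarizes.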
Problem

Research questions and friction points this paper is trying to address.

contextual informativeness
vocabulary learning
context selection
educational NLP
deep learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

contextual informativeness
deep learning
embedding fine-tuning
Retention Competency Curve
vocabulary instruction