🤖 AI Summary
Existing knowledge tracing (KT) methods rely heavily on labor-intensive, error-prone manual annotation of knowledge concepts (KCs), and they neglect the semantic relationships between questions and KCs. To address this, the authors propose KCQRL, an LLM-driven framework that generates stepwise question solutions and automatically annotates the KCs in each solution step, eliminating the need for expert labeling. KCQRL further introduces a contrastive learning mechanism with false-negative elimination to achieve fine-grained semantic alignment among questions, solution steps, and KCs. Because the resulting embeddings simply replace the randomly initialized ones in existing architectures, KCQRL is compatible with mainstream KT models: integrated into 15 representative models and evaluated on two large-scale real-world mathematics learning datasets, it consistently delivers significant gains, with average AUC improvements of 0.8–1.5%, improving both modeling accuracy and generalization across datasets.
📝 Abstract
Knowledge tracing (KT) is a popular approach for modeling students' learning progress over time, which can enable more personalized and adaptive learning. However, existing KT approaches face two major limitations: (1) they rely heavily on expert-defined knowledge concepts (KCs) in questions, which is time-consuming and prone to errors; and (2) KT methods tend to overlook the semantics of both questions and the given KCs. In this work, we address these challenges and present KCQRL, a framework for automated knowledge concept annotation and question representation learning that can improve the effectiveness of any existing KT model. First, we propose an automated KC annotation process using large language models (LLMs), which generates question solutions and then annotates KCs in each solution step of the questions. Second, we introduce a contrastive learning approach to generate semantically rich embeddings for questions and solution steps, aligning them with their associated KCs via a tailored false negative elimination approach. These embeddings can be readily integrated into existing KT models, replacing their randomly initialized embeddings. We demonstrate the effectiveness of KCQRL across 15 KT algorithms on two large real-world Math learning datasets, where we achieve consistent performance improvements.
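The contrastive alignment with false-negative elimination can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact formulation: it assumes an InfoNCE-style objective over in-batch pairs of question and KC embeddings, where any in-batch "negative" that shares the anchor's KC label is masked out rather than pushed apart. The function name and signature are hypothetical.

```python
import numpy as np

def contrastive_loss_fn_elim(q_emb, kc_emb, kc_labels, temperature=0.1):
    """InfoNCE-style loss aligning question embeddings with KC embeddings.

    In-batch negatives sharing the anchor's KC label are masked out
    (false-negative elimination). Hypothetical sketch of the idea, not
    the paper's exact objective.
    """
    # Normalize so dot products become cosine similarities.
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    k = kc_emb / np.linalg.norm(kc_emb, axis=1, keepdims=True)
    sim = (q @ k.T) / temperature  # (B, B) similarity matrix

    labels = np.asarray(kc_labels)
    # Entry (i, j) is a false negative if question j has the same KC as
    # question i but is not the positive pair itself (the diagonal).
    fn_mask = (labels[:, None] == labels[None, :]) & ~np.eye(len(labels), dtype=bool)
    sim = np.where(fn_mask, -np.inf, sim)  # exp(-inf) = 0: removed from softmax

    # Cross-entropy with the diagonal (true question-KC pair) as target.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

In the full framework, analogous alignment terms would tie solution-step embeddings to their annotated KCs as well, and the trained question embeddings would then replace the randomly initialized embedding tables of a downstream KT model.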