AI Summary
Knowledge tracing (KT) relies heavily on manually annotated knowledge components (KCs) and domain-specific statistical rules, hindering adaptation to AI-generated educational content. Method: We propose the first zero-shot and few-shot framework for automatic KC extraction from multimodal questions, leveraging instruction-tuned multimodal large language models (e.g., LLaVA, Qwen-VL). Our approach integrates cross-modal alignment with knowledge graph-guided clustering to yield interpretable and transferable KCs. Contribution/Results: Evaluated on a five-subject KT benchmark, our automatically extracted KCs incur only a 1.2% AUC drop compared to human-annotated labels. In few-shot settings, our method significantly improves the interpretability of assessments. Crucially, it eliminates dependence on expert annotation and handcrafted domain rules, enabling scalable, automated assessment in low-resource educational settings.
Abstract
Knowledge tracing models have enabled a range of intelligent tutoring systems to provide feedback to students. However, existing methods for knowledge tracing in the learning sciences rely predominantly on statistical data and instructor-defined knowledge components, making it challenging to integrate AI-generated educational content with these established methods. We propose a method for automatically extracting knowledge components from educational content using instruction-tuned large multimodal models. We validate this approach by comprehensively evaluating it against knowledge tracing benchmarks in five domains. Our results indicate that the automatically extracted knowledge components can effectively replace human-tagged labels, offering a promising direction for enhancing intelligent tutoring systems in limited-data scenarios, achieving more explainable assessments in educational settings, and laying the groundwork for automated assessment.
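To make the zero-shot KC-extraction idea concrete, the following is a minimal sketch, not the paper's actual pipeline: it composes a prompt asking an instruction-tuned model to name the knowledge components a question exercises, and parses the line-per-KC reply. The function names, prompt wording, and the mocked model reply are all illustrative assumptions; in practice the prompt would be sent to a multimodal model such as LLaVA or Qwen-VL along with any question image.

```python
# Hedged sketch of zero-shot KC extraction (illustrative only; not the
# authors' implementation). No real model is called here: the reply is mocked.

def build_kc_prompt(question_text: str, subject: str, max_kcs: int = 3) -> str:
    """Compose a zero-shot KC-extraction prompt for a single question."""
    return (
        f"You are an expert {subject} instructor.\n"
        f"List up to {max_kcs} knowledge components (skills or concepts) a "
        "student must know to answer the question below. "
        "Reply with one short KC name per line.\n\n"
        f"Question: {question_text}"
    )

def parse_kcs(model_reply: str) -> list[str]:
    """Turn a line-per-KC model reply into a clean list of KC labels."""
    return [
        line.strip("-* ").strip()
        for line in model_reply.splitlines()
        if line.strip()
    ]

# Hypothetical usage with a mocked model reply:
prompt = build_kc_prompt("Solve for x: 2x + 3 = 11", subject="algebra")
reply = "- solving linear equations\n- inverse operations"
print(parse_kcs(reply))  # → ['solving linear equations', 'inverse operations']
```

In a few-shot variant, worked question-to-KC examples would simply be prepended to the same prompt; the extracted labels could then feed a clustering step to merge near-duplicate KCs across questions.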