Generative AI for Multiple Choice STEM Assessments

📅 2025-06-02
🤖 AI Summary
Generative AI frequently produces mathematically erroneous yet semantically plausible distractors (termed “valid hallucinations”) in STEM multiple-choice assessment items, compromising measurement validity. Method: This study introduces a mathematics-semantic-guided distractor generation paradigm that systematically transforms AI hallucinations into pedagogical assets. Our approach integrates the Möbius platform architecture, a domain-specific mathematics semantic package, parametric STEM content modeling, mathematics-oriented prompt engineering, and LLM fine-tuning. Contribution/Results: The generated distractors exhibit both mathematical accuracy and cognitive plausibility, significantly enhancing assessment validity as confirmed by expert review and empirical testing. Compared to manual item writing, our method reduces item bank development time and labor costs by over 70%, while preserving academic rigor and psychometric soundness. This work represents the first systematic effort to leverage generative AI hallucinations for educational assessment improvement.

📝 Abstract
Artificial intelligence technology enables a range of enhancements in computer-aided instruction, from accelerating the creation of teaching materials to customizing learning paths based on learner outcomes. However, ensuring the mathematical accuracy and semantic integrity of generative AI output remains a significant challenge, particularly in STEM disciplines. In this study, we explore the use of generative AI in which "hallucinations" (typically viewed as undesirable inaccuracies) can instead serve a pedagogical purpose. Specifically, we investigate the generation of plausible but incorrect alternatives for multiple choice assessments, where credible distractors are essential for effective assessment design. We describe the Möbius platform for online instruction, with particular focus on its architecture for handling mathematical elements through specialized semantic packages that support dynamic, parameterized STEM content. We examine methods for crafting prompts that interact effectively with these mathematical semantics to guide the AI in generating high-quality multiple choice distractors. Finally, we demonstrate how this approach reduces the time and effort associated with creating robust teaching materials while maintaining academic rigor and assessment validity.
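To make the idea of prompting against parameterized content more concrete, here is a minimal sketch of how a prompt for distractors might be assembled from a parameterized item. It is an illustration only, not the Möbius platform's API or the authors' actual prompts: the class `PowerRuleItem`, the function `build_distractor_prompt`, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerRuleItem:
    """Hypothetical parameterized item: differentiate a*x^n with respect to x."""
    a: int
    n: int

    def stem(self) -> str:
        # Question text generated from the parameters.
        return f"Differentiate f(x) = {self.a}*x^{self.n} with respect to x."

    def correct_answer(self) -> str:
        # Power rule: d/dx (a*x^n) = (a*n)*x^(n-1).
        return f"{self.a * self.n}*x^{self.n - 1}"


def build_distractor_prompt(item: PowerRuleItem, count: int = 3) -> str:
    """Assemble a plain-text prompt asking a model for plausible wrong answers.

    The wording is an assumption made for illustration, not a tested prompt.
    """
    return "\n".join([
        "You are writing a multiple choice question for a calculus course.",
        f"Question: {item.stem()}",
        f"Correct answer: {item.correct_answer()}",
        f"Propose {count} incorrect answers that reflect common student errors.",
        "Write each answer on its own line as an expression in x.",
    ])


if __name__ == "__main__":
    print(build_distractor_prompt(PowerRuleItem(a=3, n=4)))
```

Because the stem and the correct answer are both derived from the same parameters, each new parameter choice yields a matching prompt without further manual writing.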
Problem

Research questions and friction points this paper is trying to address.

Ensuring mathematical accuracy in generative AI for STEM
Generating plausible incorrect options for multiple choice assessments
Reducing time to create valid STEM teaching materials
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative AI creates plausible incorrect STEM answers
Möbius platform handles dynamic parameterized math content
Specialized semantic packages ensure mathematical accuracy
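A distractor is only useful if it is genuinely incorrect and distinct from the other options. The sketch below shows one simple way such a check could be done, under stated assumptions: candidates are already parsed into Python callables, and equivalence is tested by exact evaluation at a few rational sample points. This is a heuristic stand-in for a real mathematics semantic package, and the names `equivalent` and `filter_distractors` are hypothetical.

```python
from fractions import Fraction
from typing import Callable, Dict, List

Expr = Callable[[Fraction], Fraction]

# Assumed sample points; agreement at these points is evidence of
# equivalence, not a proof.
SAMPLE_POINTS = [Fraction(1, 2), Fraction(2), Fraction(3), Fraction(-5, 3)]


def equivalent(f: Expr, g: Expr) -> bool:
    """Return True if f and g agree exactly at every sample point."""
    return all(f(x) == g(x) for x in SAMPLE_POINTS)


def filter_distractors(correct: Expr, candidates: Dict[str, Expr]) -> List[str]:
    """Keep labels of candidates that are wrong and mutually distinct."""
    kept_labels: List[str] = []
    kept_exprs: List[Expr] = []
    for label, expr in candidates.items():
        if equivalent(expr, correct):
            continue  # accidentally correct, so not a valid distractor
        if any(equivalent(expr, other) for other in kept_exprs):
            continue  # duplicates an option that was already kept
        kept_labels.append(label)
        kept_exprs.append(expr)
    return kept_labels


if __name__ == "__main__":
    # Item: differentiate 3*x^4; the correct answer is 12*x^3.
    candidates = {
        "3*x^3": lambda x: 3 * x**3,      # forgot to multiply by the exponent
        "12*x^4": lambda x: 12 * x**4,    # forgot to lower the exponent
        "x^3*12": lambda x: x**3 * 12,    # correct answer in another form
        "x^3*3": lambda x: x**3 * 3,      # same as the first candidate
    }
    print(filter_distractors(lambda x: 12 * x**3, candidates))
```

Using `Fraction` keeps the comparison exact, so two candidates are never separated or merged because of floating-point rounding.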
Christina Perdikoulias
Digital Education Company Ltd., Waterloo, N2L 0C7 Canada
Chad Vance
Digital Education Company Ltd., Waterloo, N2L 0C7 Canada
Stephen M. Watt
University of Waterloo
computer algebra, compilers, handwriting recognition