Towards Valid Student Simulation with Large Language Models

📅 2026-01-09

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the “competence paradox” in student simulation: large language models asked to emulate partially knowledgeable learners are broadly more capable than those learners, so they often produce implausible error patterns and learning trajectories. To mitigate this, the authors reframe student simulation as a constrained generation problem and propose an Epistemic State Specification (ESS) that governs what the simulated learner can access, how its errors are structured, and how its state evolves over time. They further introduce a Goal-by-Environment framework that situates simulated student systems by behavioral objective and deployment context. Centered on epistemic fidelity rather than surface realism, the work does not propose a new system or benchmark. Instead, it synthesizes existing literature, formalizes key design dimensions, and identifies open challenges concerning validity, evaluation, and ethical risks, with the aim of laying a conceptual foundation for reliable educational simulation tools.

📝 Abstract

This paper presents a conceptual and methodological framework for large language model (LLM) based student simulation in educational settings. The authors identify a core failure mode, termed the "competence paradox", in which broadly capable LLMs are asked to emulate partially knowledgeable learners, leading to unrealistic error patterns and learning dynamics. To address this, the paper reframes student simulation as a constrained generation problem governed by an explicit Epistemic State Specification (ESS), which defines what a simulated learner can access, how errors are structured, and how learner state evolves over time. The work further introduces a Goal-by-Environment framework to situate simulated student systems according to behavioral objectives and deployment contexts. Rather than proposing a new system or benchmark, the paper synthesizes prior literature, formalizes key design dimensions, and articulates open challenges related to validity, evaluation, and ethical risks. Overall, the paper argues for epistemic fidelity over surface realism as a prerequisite for using LLM-based simulated students as reliable scientific and pedagogical instruments.
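To make the three roles of an ESS concrete (what the learner can access, how errors are structured, how state evolves), here is a minimal illustrative sketch. It is not the paper's formalization: the class name `EpistemicState`, its fields, and the constraint strings are all hypothetical, chosen only to show how an explicit state could constrain what a simulator is allowed to generate.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EpistemicState:
    """Hypothetical ESS sketch; all names here are assumptions, not the paper's."""

    # Knowledge access: skills the simulated learner may use correctly.
    mastered: set[str] = field(default_factory=set)
    # Error structure: skill -> description of a systematic misconception.
    misconceptions: dict[str, str] = field(default_factory=dict)

    def constraint_for(self, skill: str) -> str:
        """Return the generation constraint a prompt could carry for this skill."""
        if skill in self.mastered:
            return f"answer correctly using {skill}"
        if skill in self.misconceptions:
            return f"apply the misconception: {self.misconceptions[skill]}"
        # Neither mastered nor misconceived: the learner has no access at all.
        return f"no access to {skill}; express uncertainty"

    def learn(self, skill: str) -> None:
        """State evolution: instruction resolves a misconception and grants access."""
        self.misconceptions.pop(skill, None)
        self.mastered.add(skill)


state = EpistemicState(
    mastered={"addition"},
    misconceptions={"fraction_addition": "adds numerators and denominators"},
)
before = state.constraint_for("fraction_addition")
state.learn("fraction_addition")
after = state.constraint_for("fraction_addition")
```

In this sketch the simulator's errors come from the declared state rather than from the LLM's own guess at what a weak student looks like, which is one way to read the abstract's call for epistemic fidelity over surface realism.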

Problem

Research questions and friction points this paper is trying to address.

student simulation

large language models

competence paradox

epistemic fidelity

educational AI

Innovation

Methods, ideas, or system contributions that make the work stand out.

Epistemic State Specification

competence paradox

constrained generation