From Words to Wisdom: Discourse Annotation and Baseline Models for Student Dialogue Understanding

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Educational researchers need an efficient way to identify the discourse features that signal knowledge construction and task execution in student dialogues. Manual annotation is costly and hard to scale, and existing NLP frameworks lack education-specific discourse analysis models. Method: We introduce the first fine-grained, knowledge-construction-focused discourse annotation dataset for educational dialogue and formulate a functional classification task at the level of the discourse unit (each turn of talk). We evaluate the pre-trained LLMs GPT-3.5 and Llama-3.1 as baselines. Results: The baselines leave a substantial performance gap (best F1 = 62.3%), which exposes the limits of current LLMs in modeling implicit cognitive intentions in educational contexts. This work contributes a benchmark dataset, a standardized evaluation setup, and methodological insights for automated educational dialogue analysis, moving educational AI toward modeling deeper cognitive processes.

📝 Abstract
Identifying discourse features in student conversations helps educational researchers recognize the curricular and pedagogical variables that lead students to engage in constructing knowledge rather than merely completing tasks. Manual analysis of student conversations to identify these discourse features is time-consuming and labor-intensive, which limits the scale and scope of studies. Natural language processing (NLP) techniques can detect these discourse features automatically, offering educational researchers scalable and data-driven insights. However, existing NLP studies of discourse in dialogue rarely address educational data. In this work, we address this gap by introducing an annotated educational dialogue dataset of student conversations featuring knowledge construction and task production discourse. We also establish baseline models for automatically predicting these discourse properties for each turn of talk within conversations, using the pre-trained large language models GPT-3.5 and Llama-3.1. Experimental results show that these state-of-the-art models perform suboptimally on this task, leaving substantial room for future research.
Problem

Research questions and friction points this paper is trying to address.

Automating discourse feature detection in student conversations
Addressing the gap in NLP for educational dialogue analysis
Establishing baseline models for knowledge construction prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Annotated educational dialogue dataset for knowledge construction
Baseline models using GPT-3.5 and Llama-3.1
Automatic prediction of discourse properties per turn
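
The per-turn prediction task and its F1 evaluation can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two label strings, the prompt wording, the context window of two preceding turns, the macro-averaged F1, and the `keyword_stub` function, which stands in for a call to a model such as GPT-3.5 or Llama-3.1 so that the example runs offline.

```python
from typing import Callable, Dict, List

# Assumed label names; the dataset's actual label inventory may differ.
LABELS = ["knowledge_construction", "task_production"]


def build_prompt(turns: List[str], index: int, window: int = 2) -> str:
    """Format the target turn plus up to `window` preceding turns as context."""
    start = max(0, index - window)
    context = "\n".join(f"- {t}" for t in turns[start:index]) or "(none)"
    return (
        "Classify the target turn of a student conversation as one of: "
        + ", ".join(LABELS) + ".\n"
        + f"Previous turns:\n{context}\n"
        + f"Target turn: {turns[index]}\n"
        + "Label:"
    )


def predict_turns(turns: List[str], classify: Callable[[str], str]) -> List[str]:
    """Predict one label per turn by sending each prompt to `classify`."""
    return [classify(build_prompt(turns, i)) for i in range(len(turns))]


def f1_per_label(gold: List[str], pred: List[str]) -> Dict[str, float]:
    """Compute F1 = 2*TP / (2*TP + FP + FN) separately for each label."""
    scores = {}
    for label in LABELS:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold: List[str], pred: List[str]) -> float:
    """Unweighted mean of the per-label F1 scores (an assumed averaging choice)."""
    scores = f1_per_label(gold, pred)
    return sum(scores.values()) / len(scores)


def keyword_stub(prompt: str) -> str:
    """Hypothetical placeholder for an LLM call; it only checks two keywords."""
    target = prompt.split("Target turn: ", 1)[1].split("\n", 1)[0].lower()
    return LABELS[0] if ("why" in target or "because" in target) else LABELS[1]


if __name__ == "__main__":
    # Invented toy conversation and gold labels, for demonstration only.
    demo_turns = ["Why does the beam bend?", "Pass me the tape."]
    demo_gold = [LABELS[0], LABELS[1]]
    demo_pred = predict_turns(demo_turns, keyword_stub)
    print(demo_pred, macro_f1(demo_gold, demo_pred))
```

In practice `keyword_stub` would be replaced by a function that sends the prompt to a language model and maps its reply to one of the labels. The paper summary reports F1 without stating the averaging scheme, so macro averaging here is only one plausible choice.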
Farjana Sultana Mim
Department of Electrical and Computer Engineering, Tufts University, Medford, MA 02155, United States
Shuchin Aeron
Professor, Electrical and Computer Engineering, Tufts University
Signal Processing, Machine Learning, High-dim Statistics, Optimal Transport
Eric Miller
Department of Electrical and Computer Engineering, Computer Science and Biomedical Engineering, Tufts University, Medford, MA 02155, United States
Kristen Wendell
Department of Mechanical Engineering and Education, Tufts University, Medford, MA 02155, United States