Automated Coding of Communications in Collaborative Problem-solving Tasks Using ChatGPT

📅 2024-11-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Manual coding of collaborative problem-solving (CPS) dialogues is labor-intensive, error-prone, and poorly scalable, hindering large-scale assessment of 21st-century competencies. Method: This study investigates the feasibility of automating CPS dialogue coding with large language models (LLMs), proposing a misclassification-feedback-driven prompt optimization approach. It systematically evaluates multiple generations of ChatGPT, including GPT-4o-mini and the reasoning-focused GPT-o1-mini and GPT-o3-mini, on CPS communication coding tasks across five real-world datasets and two established coding frameworks. Contribution/Results: The study finds that reasoning-enhanced LLMs do not necessarily outperform lightweight variants in this domain. The approach achieves acceptable coding quality, with prompt refinement based on miscoded cases improving accuracy on some subtasks, though not consistently. The resulting paradigm offers a reproducible, scalable, cross-dataset AI-assisted coding workflow for educational assessment, supporting rigorous, high-throughput evaluation of collaborative problem-solving skills.

📝 Abstract
Collaborative problem solving (CPS) is widely recognized as a critical 21st-century skill. Assessing CPS depends heavily on coding the communication data using a construct-relevant framework, and this process has long been a major bottleneck to scaling up such assessments. Based on five datasets and two coding frameworks, we demonstrate that ChatGPT can code communication data to a satisfactory level, though performance varies across ChatGPT models, and depends on the coding framework and task characteristics. Interestingly, newer reasoning-focused models such as GPT-o1-mini and GPT-o3-mini do not necessarily yield better coding results. Additionally, we show that refining prompts based on feedback from miscoded cases can improve coding accuracy in some instances, though the effectiveness of this approach is not consistent across all tasks. These findings offer practical guidance for researchers and practitioners in developing scalable, efficient methods to analyze communication data in support of 21st-century skill assessment.
Problem

Research questions and friction points this paper is trying to address.

Automating coding of communication data for collaborative problem-solving assessments
Evaluating ChatGPT performance variations across models and coding frameworks
Improving coding accuracy through prompt refinement in specific cases
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using ChatGPT to code CPS communication data across five datasets and two frameworks
Showing that coding performance varies by model, coding framework, and task characteristics
Refining prompts based on feedback from miscoded cases to improve accuracy on some tasks
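The misclassification-feedback loop described above can be sketched roughly as follows. This is an illustrative outline only, assuming a labeled calibration set; the paper does not publish code, and `code_utterance`, `refine_prompt`, and `optimize` are hypothetical names (in practice `code_utterance` would call the ChatGPT API).

```python
def code_utterance(prompt: str, utterance: str) -> str:
    """Stand-in for an LLM call returning a CPS code for one utterance.
    A real implementation would send `prompt` + `utterance` to the ChatGPT API."""
    return "negotiation" if "agree" in utterance.lower() else "sharing"

def refine_prompt(prompt: str, miscoded: list[tuple[str, str, str]]) -> str:
    """Append miscoded cases to the prompt as corrective examples."""
    examples = "\n".join(
        f'Utterance: "{u}" -> correct code: {gold} (not {pred})'
        for u, pred, gold in miscoded
    )
    return prompt + "\nAvoid these mistakes:\n" + examples

def optimize(prompt: str, labeled: list[tuple[str, str]], rounds: int = 3) -> str:
    """Iteratively refine the prompt using misclassified labeled cases."""
    for _ in range(rounds):
        miscoded = [
            (u, pred, gold)
            for u, gold in labeled
            if (pred := code_utterance(prompt, u)) != gold
        ]
        if not miscoded:  # all calibration cases coded correctly
            break
        prompt = refine_prompt(prompt, miscoded)
    return prompt
```

Consistent with the paper's findings, such a loop need not converge to better coding on every task; the abstract notes the gains are inconsistent across tasks.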
Jiangang Hao
Educational Testing Service
Game-based Assessment · Data Science & Machine Learning · Assessment of 21st Century Skills
Wenju Cui
University of Science and Technology of China
Medical Image Analysis
Patrick Kyllonen
ETS Research Institute, Princeton, NJ 08541, USA
Emily Kerzabi
ETS Research Institute, Princeton, NJ 08541, USA
Lei Liu
ETS Research Institute, Princeton, NJ 08541, USA
Michael Flor
Educational Testing Service
Natural Language Processing · Educational Technology