AI Summary
Existing multimodal dialogue systems for foreign language learning lack validated instruments for assessing user experience. Method: This study develops and validates a dual-dimensional scale measuring user engagement and rapport, integrating theories from educational psychology, social psychology, and second language acquisition to distinguish human-tutor from AI-agent experiences. Rigorous psychometric evaluation was conducted, including Cronbach's α analysis, confirmatory factor analysis (CFA), and human-AI comparative experiments. Contribution/Results: The scale demonstrates excellent reliability (α > 0.90) and construct validity (CFI > 0.95). Empirical findings reveal systematic differences between human and AI interactions across subdimensions, including task focus, affective responsiveness, and trust formation, highlighting critical design implications. The validated instrument provides a reusable theoretical framework and measurement benchmark for iterative UX optimization and evaluation of multimodal educational dialogue systems.
Abstract
This study aimed to develop and validate two scales, measuring engagement and rapport, for evaluating the quality of user experience with multimodal dialogue systems in the context of foreign language learning. The scales were designed based on theories of engagement in educational psychology, social psychology, and second language acquisition. Seventy-four Japanese learners of English completed roleplay and discussion tasks with trained human tutors and with a dialogue agent. After completing each dialogue task, they responded to the engagement and rapport scales. The validity and reliability of the scales were investigated through two analyses. We first computed Cronbach's alpha coefficients and conducted a series of confirmatory factor analyses to test the structural validity of the scales and the reliability of our designed items. We then compared engagement and rapport scores between the dialogues with human tutors and those with the dialogue agent. The results revealed that our scales captured differences in dialogue experience quality between the human interlocutors and the dialogue agent from multiple perspectives.
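The reliability analysis above rests on Cronbach's alpha, which relates the sum of per-item variances to the variance of the summed scale scores. A minimal sketch of that computation follows; the function name and the sample Likert responses are illustrative assumptions, not the study's actual instrument or data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Hypothetical 5-point Likert responses: 4 respondents x 3 items
scores = np.array([[5, 4, 5],
                   [4, 4, 4],
                   [2, 3, 2],
                   [3, 3, 3]])
print(round(cronbach_alpha(scores), 3))  # → 0.931
```

Values above 0.90, as reported for the validated scale, indicate that the items covary strongly enough to be treated as measurements of a single construct.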