ConversAR: Exploring Embodied LLM-Powered Group Conversations in Augmented Reality for Second Language Learners

📅 2025-04-25
🏛️ CHI Extended Abstracts
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing AR-based language learning tools are largely confined to dyadic interaction, offering little support for contextualized, scene-aware group dialogue for second language (L2) learners and rarely combining embodied large language model (LLM) agents with AR visual understanding. This work introduces ConversAR, an AR group conversation system that pairs GPT-4o-powered embodied LLM agents with real-time vision-based scene understanding, speech recognition and synthesis, and live captions designed to ease speaking anxiety. In a system evaluation with 10 participants, users reported reduced speaking anxiety and increased learner autonomy compared to their perceptions of in-person practice with other learners. The core contributions are: (1) a design for AR-based, embodied-LLM-supported L2 group dialogue practice; and (2) a unified design combining multimodal real-time interaction with support for learner anxiety.

📝 Abstract
Group conversations are valuable for second language (L2) learners as they provide opportunities to practice listening and speaking, exercise complex turn-taking skills, and experience group social dynamics in a target language. However, most existing Augmented Reality (AR)-based conversational learning tools focus on dyadic interactions rather than group dialogues. Although research has shown that AR can help reduce speaking anxiety and create a comfortable space for practicing speaking skills in dyadic scenarios, especially with Large Language Model (LLM)-based conversational agents, the potential for group language practice using these technologies remains largely unexplored. We introduce ConversAR, a GPT-4o-powered AR application that enables L2 learners to practice contextualized group conversations. Our system features two embodied LLM agents with vision-based scene understanding and live captions. In a system evaluation with 10 participants, users reported reduced speaking anxiety and increased learner autonomy compared to perceptions of in-person practice methods with other learners.
Problem

Research questions and friction points this paper is trying to address.

Exploring AR-based group conversations for L2 learners
Addressing the lack of group dialogue tools in AR language learning
Reducing speaking anxiety with embodied LLM agents in AR
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPT-4o-powered AR for group conversations
Embodied LLM agents with scene understanding
Live captions to reduce speaking anxiety (illustrative pipeline sketch below)
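
The paper's summary describes the pipeline only at a high level. As a rough illustration, the sketch below shows how one conversational turn of such an agent could be wired with the OpenAI API, assuming GPT-4o for scene-grounded dialogue, Whisper for speech recognition, and a TTS voice for the embodied agent. This is not the ConversAR implementation; the function, file names, and parameters are hypothetical.

```python
# Illustrative sketch only (not from the paper): one turn for an embodied
# GPT-4o agent with vision-based scene understanding and live captions,
# assuming the OpenAI Python SDK. Names and paths are hypothetical.
import base64
from openai import OpenAI

client = OpenAI()

def agent_turn(agent_persona: str, ar_frame_jpeg: bytes, user_speech_wav: str) -> str:
    """Transcribe the learner's speech, ground the reply in the current AR
    camera frame, and return the agent's utterance (also shown as a caption)."""
    # 1. Speech recognition: turn the learner's utterance into text.
    with open(user_speech_wav, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="whisper-1", file=f
        ).text

    # 2. Scene-grounded dialogue: send the AR frame and the transcript to
    #    GPT-4o, with the agent's persona as the system prompt.
    frame_b64 = base64.b64encode(ar_frame_jpeg).decode()
    reply = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": agent_persona},
            {"role": "user", "content": [
                {"type": "text", "text": transcript},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{frame_b64}"}},
            ]},
        ],
    ).choices[0].message.content

    # 3. Speech synthesis: voice the reply; the same text can drive the
    #    live caption rendered in the AR scene.
    speech = client.audio.speech.create(model="tts-1", voice="alloy", input=reply)
    speech.write_to_file("agent_reply.mp3")
    return reply
```

A real group-conversation setting would run one such loop per embodied agent, with a turn-taking policy deciding which agent responds to the learner or to the other agent.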
Jad Bendarkawi
Princeton University
Ashley Ponce
Princeton University
Sean Chidozie Mata
Princeton University
Aminah Aliu
Princeton University
Yuhan Liu
Princeton University
Lei Zhang
Princeton University
Amna Liaqat
Princeton University
Computer Science · Education · Human-Computer Interaction
Varun Nagaraj Rao
Center for Information Technology Policy, Princeton University
AI Audits · Labor · HCI · Vision-Language Models · CS Education
Andrés Monroy-Hernández
Princeton University, Computer Science
Human-Computer Interaction · Social Computing