🤖 AI Summary
This study addresses a limitation of conventional AI educational tools, which typically pair a single LLM tutor with a single learner and so miss the benefits of multi-party interaction documented in human social learning. The authors study multi-agent LLM learning environments in two controlled experiments. In a mathematics study (315 participants), learners solve SAT-level problems in a 2×2 design that varies the presence of an LLM tutor and of LLM peers who make either conceptual or arithmetic errors; participants who worked with both a tutor and peers achieved the highest unassisted test accuracy. In a writing study (247 participants), learners write argumentative and creative essays with no AI assistance, with a single LLM (Claude or ChatGPT), or with both models together; both LLM conditions improved essay quality, but only the two-agent condition avoided the idea-level homogeneity that single-model assistance produced.
📝 Abstract
Most AI-based educational tools today adopt a one-on-one tutoring paradigm, pairing a single LLM with a single learner. Yet decades of learning science research suggest that multi-party interaction -- through peer modeling, co-construction, and exposure to diverse perspectives -- can produce learning benefits that dyadic tutoring alone cannot. In this paper, we investigate whether multi-agent LLM configurations can enhance learning outcomes beyond what a single LLM tutor provides. We present two controlled experiments spanning distinct learning contexts. In a convergent problem-solving study ($N=315$), participants tackle SAT-level math problems in a 2$\times$2 design that varies the presence of an LLM tutor and LLM peers, each making different kinds of errors (conceptual vs.\ arithmetic); participants who interacted with both a tutor and peers achieved the highest unassisted test accuracy. In a divergent composition study ($N=247$), participants write argumentative and creative essays with either no AI assistance, a single LLM (Claude or ChatGPT), or both Claude and ChatGPT together; while both LLM conditions improved essay quality, only the two-agent condition avoided the idea-level homogeneity that single-model assistance was found to produce. Together, these studies offer one of the first controlled investigations of multi-agent LLM learning environments, probing whether the move from one-on-one AI tutoring toward richer agent configurations can unlock the collaborative and observational benefits long documented in human social learning research.