SimClass: A Classroom Speech Dataset Generated via Game Engine Simulation For Automatic Speech Recognition Research

📅 2025-06-10
🤖 AI Summary
A critical bottleneck in educational ASR is the scarcity of large-scale, high-fidelity classroom speech data. Existing public datasets are small and lack realistic noise conditions, which limits model robustness in authentic teaching environments. To address this, we propose a scalable acoustic modeling framework for classroom scenes built on game-engine-based physical audio simulation. We introduce SimClass, described as the first open-source synthetic classroom speech benchmark, which features a diverse classroom noise library and child-teacher interactive speech samples. The approach combines physics-based acoustic simulation, speech-video temporal alignment and synthesis, child voice conversion, and contextual speech generation driven by YouTube educational videos. Experiments show that SimClass closely approximates real classroom speech distributions under both clean and noisy conditions, and that it improves ASR accuracy and noise robustness across multiple models in educational settings.

📝 Abstract
The scarcity of large-scale classroom speech data has hindered the development of AI-driven speech models for education. Public classroom datasets remain limited, and the lack of a dedicated classroom noise corpus prevents the use of standard data augmentation techniques. In this paper, we introduce a scalable methodology for synthesizing classroom noise using game engines, a framework that extends to other domains. Using this methodology, we present SimClass, a dataset that includes both a synthesized classroom noise corpus and a simulated classroom speech dataset. The speech data is generated by pairing a public children's speech corpus with YouTube lecture videos to approximate real classroom interactions in clean conditions. Our experiments on clean and noisy speech demonstrate that SimClass closely approximates real classroom speech, making it a valuable resource for developing robust speech recognition and enhancement models.
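The "standard data augmentation techniques" mentioned above usually mean mixing a noise recording into clean speech at a chosen signal-to-noise ratio. The sketch below illustrates that generic technique only; it is not the authors' pipeline. The function name `mix_at_snr` and the synthetic test signals are hypothetical, and it assumes mono floating-point waveforms at a shared sample rate.

```python
import numpy as np


def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Add noise to speech so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; assumes mono float waveforms
    recorded at the same sample rate.
    """
    if len(noise) == 0:
        raise ValueError("noise clip must not be empty")

    # Loop the noise clip if it is shorter than the speech, then trim to length.
    repeats = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, repeats)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        # Silent noise clip: nothing to add.
        return speech.copy()

    # Choose the noise power so that
    # 10 * log10(speech_power / scaled_noise_power) == snr_db.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise


if __name__ == "__main__":
    # Synthetic stand-ins: a 220 Hz tone as "speech", white noise as "classroom noise".
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    clean = 0.5 * np.sin(2 * np.pi * 220.0 * t)
    babble = rng.standard_normal(4000)
    noisy = mix_at_snr(clean, babble, snr_db=10.0)
    print(noisy.shape)
```

In practice the noise clips would come from a corpus such as the synthesized classroom noise set described here, and the SNR would typically be sampled from a range rather than fixed.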
Problem

Research questions and friction points this paper is trying to address.

Lack of large-scale classroom speech data for AI models
Absence of dedicated classroom noise corpus for augmentation
Need for realistic simulated classroom speech datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesize classroom noise using game engines
Pair children's speech with lecture videos
SimClass dataset mimics real classroom speech
Ahmed Adel Attia
University of Maryland
Jing Liu
College of Education, University of Maryland
Carol Espy-Wilson
Electrical and Computer Engineering, University of Maryland