SimClass: A Classroom Speech Dataset Generated via Game Engine Simulation For Automatic Speech Recognition Research

📅 2025-06-10

🤖 AI Summary

A critical bottleneck in educational ASR is the scarcity of large-scale, high-fidelity classroom speech data: existing public datasets are small and lack realistic noise conditions, which limits model robustness in authentic teaching environments. To address this, we propose a scalable framework for modeling classroom acoustics, built on physics-based audio simulation in a game engine. We introduce SimClass, the first open-source synthetic classroom speech benchmark, which combines a diverse classroom noise library with child-teacher interactive speech samples. Our approach integrates physics-based acoustic simulation, speech-video temporal alignment and synthesis, child voice conversion, and contextual speech generation driven by YouTube educational videos. Experiments show that SimClass closely approximates real classroom speech distributions under both clean and noisy conditions, and that it consistently improves ASR accuracy and noise robustness across multiple models in educational settings.

📝 Abstract

The scarcity of large-scale classroom speech data has hindered the development of AI-driven speech models for education. Public classroom datasets remain limited, and the lack of a dedicated classroom noise corpus prevents the use of standard data augmentation techniques. In this paper, we introduce a scalable methodology for synthesizing classroom noise using game engines, a framework that extends to other domains. Using this methodology, we present SimClass, a dataset that includes both a synthesized classroom noise corpus and a simulated classroom speech dataset. The speech data is generated by pairing a public children's speech corpus with YouTube lecture videos to approximate real classroom interactions in clean conditions. Our experiments on clean and noisy speech demonstrate that SimClass closely approximates real classroom speech, making it a valuable resource for developing robust speech recognition and enhancement models.
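The abstract refers to standard data augmentation with a noise corpus. One common form of this is additive mixing of a noise clip into clean speech at a chosen signal-to-noise ratio (SNR). The sketch below illustrates that general technique only; it is not taken from the paper, and the function names (`rms`, `mix_at_snr`), the RMS-based SNR definition, and the synthetic tone and Gaussian noise are all assumptions made for illustration.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise so the mix has the requested SNR in dB.

    The noise is tiled or truncated to match the speech length.
    Hypothetical helper for illustration, not the paper's pipeline.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise clip so it covers the whole utterance.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[: len(speech)]

    speech_rms = rms(speech)
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(speech)  # silent noise clip: nothing to add

    # SNR_dB = 20 * log10(speech_rms / (gain * noise_rms)), solved for gain.
    gain = speech_rms / (noise_rms * 10 ** (snr_db / 20.0))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for real audio: a 220 Hz tone and Gaussian noise (assumed data).
    speech = [0.5 * math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 0.1) for _ in range(4000)]
    noisy = mix_at_snr(speech, noise, snr_db=10.0)
    added = [m - s for m, s in zip(noisy, speech)]
    # Report the SNR actually achieved by the mix.
    print(round(20 * math.log10(rms(speech) / rms(added)), 2))
```

In practice the same arithmetic would typically run on array types from an audio library, with the SNR sampled from a range per utterance so a model sees varied noise levels.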

Problem

Research questions and friction points this paper is trying to address.

Lack of large-scale classroom speech data for AI models

Absence of dedicated classroom noise corpus for augmentation

Need for realistic simulated classroom speech datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesize classroom noise using game engines

Pair children's speech with lecture videos

SimClass dataset mimics real classroom speech
