Game-Oriented ASR Error Correction via RAG-Enhanced LLM

📅 2025-09-28
🤖 AI Summary
In multiplayer online gaming scenarios, generic automatic speech recognition (ASR) systems suffer from high error rates due to short utterances, rapid speaking rates, domain-specific terminology, and strong background noise. To address these challenges, this paper proposes GO-AEC, a framework that combines large language models (LLMs) with retrieval-augmented generation (RAG) over a dynamic game-specific knowledge base. It corrects recognition errors using the ASR system's N-best hypotheses and adds a data augmentation strategy built on LLMs and text-to-speech (TTS). The reported experiments show that GO-AEC reduces character error rate (CER) by 6.22% and sentence error rate (SER) by 29.71%, improving ASR accuracy in gaming scenarios.

📝 Abstract
With the rise of multiplayer online games, real-time voice communication is essential for team coordination. However, general ASR systems struggle with gaming-specific challenges like short phrases, rapid speech, jargon, and noise, leading to frequent errors. To address this, we propose the GO-AEC framework, which integrates large language models, Retrieval-Augmented Generation (RAG), and a data augmentation strategy using LLMs and TTS. GO-AEC includes data augmentation, N-best hypothesis-based correction, and a dynamic game knowledge base. Experiments show GO-AEC reduces character error rate by 6.22% and sentence error rate by 29.71%, significantly improving ASR accuracy in gaming scenarios.
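The two metrics quoted in the abstract can be made concrete with a short sketch. It assumes the usual definitions: CER is the total character-level edit distance divided by the total number of reference characters, and SER is the fraction of utterances whose hypothesis differs from the reference. The helper names and the toy utterances are illustrative and not taken from the paper. Because a phrase like "reduces CER by 6.22%" can be read as either an absolute or a relative change, both calculations are shown.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(refs, hyps):
    """Character error rate over a set of utterances."""
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / total_chars


def ser(refs, hyps):
    """Sentence error rate: share of utterances with at least one error."""
    if not refs:
        raise ValueError("no references given")
    return sum(1 for r, h in zip(refs, hyps) if r != h) / len(refs)


def absolute_reduction(before, after):
    """Difference between two rates, e.g. 0.20 -> 0.15 is 0.05 (5 points)."""
    return before - after


def relative_reduction(before, after):
    """Difference as a fraction of the starting rate, e.g. 0.20 -> 0.15 is 0.25."""
    return (before - after) / before


if __name__ == "__main__":
    # Toy utterances, invented for illustration only.
    refs = ["push mid now", "ward the baron"]
    hyps = ["push mid now", "word the barren"]
    print("CER:", cer(refs, hyps))
    print("SER:", ser(refs, hyps))
```

The same pair of error rates gives very different headline numbers depending on which reduction is reported, so it is worth checking which one a paper means before comparing results.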
Problem

Research questions and friction points this paper is trying to address.

Corrects ASR errors in gaming voice communication
Addresses gaming-specific challenges like jargon and noise
Improves ASR accuracy using RAG-enhanced LLM framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates large language models with a RAG framework
Uses a data augmentation strategy based on LLMs and TTS
Implements a dynamic game knowledge base for correction
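
How retrieval and N-best correction could fit together is sketched below. This is a minimal illustration under stated assumptions, not the paper's implementation: the summary does not describe GO-AEC's retriever, knowledge base schema, or prompt format, so the `Term` record, the character-bigram similarity, and the prompt wording are all hypothetical. The sketch retrieves knowledge base terms that resemble words in the N-best hypotheses and assembles a correction prompt. The LLM call itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical knowledge base entry: a game term and a short note."""
    text: str
    note: str


def char_ngrams(s, n=2):
    """Set of character n-grams, case-insensitive, ignoring spaces."""
    s = s.lower().replace(" ", "")
    if len(s) < n:
        return {s}
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def similarity(a, b):
    """Jaccard overlap of character bigrams (a stand-in for a real retriever)."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def retrieve(nbest, kb, top_k=2):
    """Return up to top_k terms that best match any word in any hypothesis."""
    scored = []
    for term in kb:
        best = max(
            (similarity(term.text, word) for hyp in nbest for word in hyp.split()),
            default=0.0,
        )
        scored.append((best, term))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [term for score, term in scored[:top_k] if score > 0]


def build_prompt(nbest, terms):
    """Assemble a correction prompt from the hypotheses and retrieved terms."""
    lines = ["ASR hypotheses (best first):"]
    lines += [f"{i}. {hyp}" for i, hyp in enumerate(nbest, start=1)]
    lines.append("Relevant game terms:")
    lines += [f"- {t.text}: {t.note}" for t in terms]
    lines.append("Return the single most likely corrected transcript.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy knowledge base and hypotheses, invented for illustration only.
    kb = [
        Term("Baron", "neutral objective"),
        Term("ward", "vision item"),
        Term("gank", "surprise attack on a lane"),
    ]
    nbest = ["word the barren", "ward the baron"]
    print(build_prompt(nbest, retrieve(nbest, kb)))
```

Matching against every hypothesis rather than only the top one means a term can be retrieved even when the correct word appears only in a lower-ranked candidate. A production system would more plausibly use embedding or phonetic similarity than character bigrams.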