Game-Oriented ASR Error Correction via RAG-Enhanced LLM

📅 2025-09-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In multiplayer online games, general-purpose automatic speech recognition (ASR) systems have high error rates because utterances are short, speech is fast, terminology is domain-specific, and background noise is strong. To address this, the paper proposes GO-AEC, a framework that combines large language models (LLMs) with retrieval-augmented generation (RAG) over a dynamic, game-specific knowledge base. It includes an N-best hypothesis re-ranking module and a context-aware error correction mechanism, and it adds a data augmentation strategy that pairs LLMs with text-to-speech (TTS). On a real-world gaming speech test set, the reported results are a 6.22 percentage-point reduction in character error rate (CER) and a 29.71% relative decrease in sentence error rate (SER).

📝 Abstract

With the rise of multiplayer online games, real-time voice communication is essential for team coordination. However, general ASR systems struggle with gaming-specific challenges like short phrases, rapid speech, jargon, and noise, leading to frequent errors. To address this, we propose the GO-AEC framework, which integrates large language models, Retrieval-Augmented Generation (RAG), and a data augmentation strategy using LLMs and TTS. GO-AEC includes data augmentation, N-best hypothesis-based correction, and a dynamic game knowledge base. Experiments show GO-AEC reduces character error rate by 6.22% and sentence error rate by 29.71%, significantly improving ASR accuracy in gaming scenarios.
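The abstract reports reductions in character error rate and sentence error rate without spelling out how the two metrics are computed. The following is a minimal sketch of the usual definitions, assuming corpus-level CER via character-level Levenshtein distance and SER as the fraction of utterances that are not an exact match. The function names, sample utterances, and the before/after rates in the usage lines are illustrative assumptions, not values from the paper. The two reduction helpers show why it matters whether a figure such as "6.22%" is an absolute or a relative change.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps):
    """Character error rate: total character edits / total reference characters."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return edits / chars


def ser(refs, hyps):
    """Sentence error rate: fraction of utterances containing any error."""
    wrong = sum(1 for r, h in zip(refs, hyps) if r != h)
    return wrong / len(refs)


def absolute_reduction(before, after):
    """Difference in error rate (multiply by 100 for percentage points)."""
    return before - after


def relative_reduction(before, after):
    """Reduction expressed as a fraction of the original error rate."""
    return (before - after) / before


if __name__ == "__main__":
    # Illustrative utterances only; not data from the paper.
    refs = ["push mid now", "need heal"]
    hyps = ["push mid now", "need heel"]
    print("CER:", cer(refs, hyps))
    print("SER:", ser(refs, hyps))
    # Hypothetical rates: the same change reads very differently
    # depending on whether it is reported as absolute or relative.
    print("absolute:", absolute_reduction(0.20, 0.15))
    print("relative:", relative_reduction(0.20, 0.15))
```

CER is the natural unit when word boundaries are unreliable or absent, which is one reason it is often preferred over word error rate for short, noisy utterances.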

Problem

Research questions and friction points this paper is trying to address.

Corrects ASR errors in gaming voice communication

Addresses gaming-specific challenges like jargon and noise

Improves ASR accuracy using a RAG-enhanced LLM framework

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates large language models with a RAG framework

Uses a data augmentation strategy with LLMs and TTS

Implements a dynamic game knowledge base for correction
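The points above combine N-best hypotheses, retrieval from a game knowledge base, and an LLM. The following is a rough sketch of how those pieces could fit together, not the paper's implementation. It assumes a toy in-memory knowledge base with made-up English game terms, and it swaps in character-bigram Dice similarity as a stand-in for whatever retriever GO-AEC actually uses. The names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are hypothetical, and the LLM call itself is left out.

```python
from collections import Counter

# Hypothetical knowledge base (term -> description); not from the paper.
KNOWLEDGE_BASE = {
    "gank": "surprise attack on an enemy hero",
    "jungle": "map area between the lanes",
    "baron": "powerful neutral monster",
    "ult": "a hero's ultimate ability",
}


def bigrams(text):
    """Multiset of character bigrams in lowercase text."""
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def similarity(a, b):
    """Dice coefficient over character bigrams, in [0, 1]."""
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 0.0
    overlap = sum((ba & bb).values())  # multiset intersection
    return 2 * overlap / total


def retrieve(hypotheses, kb, top_k=2):
    """Rank terms by their best match against any word in any hypothesis."""
    words = {w for h in hypotheses for w in h.split()}
    scored = [
        (max((similarity(term, w) for w in words), default=0.0), term)
        for term in kb
    ]
    scored.sort(reverse=True)
    return [term for score, term in scored[:top_k] if score > 0]


def build_prompt(hypotheses, kb, top_k=2):
    """Assemble a correction prompt from N-best hypotheses and retrieved terms."""
    terms = retrieve(hypotheses, kb, top_k)
    lines = ["Candidate transcripts from the ASR system:"]
    lines += [f"{i}. {h}" for i, h in enumerate(hypotheses, start=1)]
    lines.append("Relevant game terms:")
    lines += [f"- {t}: {kb[t]}" for t in terms]
    lines.append("Return the single most likely corrected transcript.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative N-best list; the prompt would be sent to an LLM of choice.
    n_best = ["go gang the jungle", "go gank the jangle", "go bank the jungle"]
    print(build_prompt(n_best, KNOWLEDGE_BASE))
```

Retrieving against every hypothesis rather than only the top one lets a term that was recognized correctly in a lower-ranked hypothesis still reach the prompt.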

🔎 Similar Papers

ASR Error Correction using Large Language Models