🤖 AI Summary
This study identifies, for the first time, systematic fairness disparities in audio codecs along linguistic and gender dimensions: legacy PSTN codecs significantly degrade the reconstruction quality of female speech (p < 0.001), while neural codecs introduce a distinct language bias, distorting low-resource languages more heavily.
Method: Using over 2 million multilingual speech samples, we establish a cross-codec transcoding evaluation framework covering PSTN, VoIP, and neural codecs. The framework combines perceptual quality metrics applied across languages (e.g., DNSMOS, POLQA), rigorous statistical significance testing, and a quantitative bias measure.
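As a concrete illustration, the sketch below shows what one step of such a framework might look like in Python: a PSTN-style round trip through G.711 µ-law via ffmpeg, a placeholder perceptual scorer standing in for DNSMOS/POLQA, and a non-parametric test for a score gap between speaker groups. The codec choice, the `predict_quality` stub, and the Mann-Whitney U test are illustrative assumptions, not the paper's exact pipeline.

```python
"""Hedged sketch of one cross-codec fairness evaluation step.

Assumptions (not taken from the paper): ffmpeg is on PATH, and
predict_quality() stands in for a DNSMOS/POLQA-style scorer.
"""

import subprocess
import tempfile
from pathlib import Path

from scipy.stats import mannwhitneyu


def transcode_g711(src: Path, dst: Path) -> None:
    """Round-trip a wav file through G.711 mu-law at 8 kHz (narrowband PSTN)."""
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        # Encode: downsample to 8 kHz and apply mu-law companding.
        subprocess.run(
            ["ffmpeg", "-y", "-i", str(src),
             "-ar", "8000", "-c:a", "pcm_mulaw", tmp.name],
            check=True, capture_output=True)
        # Decode back to 16 kHz linear PCM for scoring.
        subprocess.run(
            ["ffmpeg", "-y", "-i", tmp.name,
             "-ar", "16000", "-c:a", "pcm_s16le", str(dst)],
            check=True, capture_output=True)


def predict_quality(wav: Path) -> float:
    """Placeholder for a perceptual quality metric such as DNSMOS or POLQA.

    The paper's exact scorer is not reproduced here; plug in any
    MOS-style predictor mapping a wav file to a scalar score.
    """
    raise NotImplementedError


def gender_gap_test(female_scores: list[float], male_scores: list[float]):
    """Two-sided non-parametric test of a quality gap between groups.

    The summary reports significance (p < 0.001) without naming the test
    here; Mann-Whitney U is one reasonable choice for MOS-like scores.
    """
    return mannwhitneyu(female_scores, male_scores, alternative="two-sided")
```

Scaling this sketch to the full study would mean repeating the transcode-and-score loop per codec and per language/gender group, then running the significance test on the pooled per-group score distributions.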
Contribution/Results: Our work fills a critical gap in audio-coding fairness research, providing the first empirical evidence of measurable socio-technical bias embedded at the telecommunications infrastructure level. It establishes both an evidence base and a methodological foundation for developing inclusive, equitable speech coding standards.
📝 Abstract
In recent years, there has been a growing focus on fairness and inclusivity within speech technology, particularly in areas such as automatic speech recognition and speech sentiment analysis. When audio is transcoded prior to processing, as is the case in streaming or real-time applications, any inherent bias in the coding mechanism may result in quality disparities across user groups. This not only affects user experience but can also have broader societal implications by perpetuating stereotypes and exclusion. Thus, it is important that audio coding mechanisms are unbiased. In this work, we contribute to the scarce research on language and gender biases of audio codecs. By analyzing the speech quality of over 2 million multilingual audio files after transcoding through a representative subset of codecs (PSTN, VoIP, and neural), our results indicate that PSTN codecs are strongly biased in terms of gender and that neural codecs introduce language biases.