GeHirNet: A Gender-Aware Hierarchical Model for Voice Pathology Classification

📅 2025-08-01

🤖 AI Summary

Voice pathology classification faces two key challenges: gender-related acoustic bias and severe class imbalance caused by the scarcity of samples for rare disorders. To address these, the authors propose a gender-aware hierarchical framework with two stages: (1) speaker gender identification and extraction of gender-specific acoustic features; and (2) gender-conditioned disease classification. They also apply multi-scale resampling and time-warping data augmentation to mitigate both bias and imbalance. The model uses ResNet-50 on Mel spectrograms and is trained on a corpus merged from four public datasets. With time warping, the two-stage architecture reaches 97.63% accuracy and a 95.25% Matthews Correlation Coefficient (MCC), a 5% MCC improvement over the single-stage baseline, which the authors report as state of the art. The authors present this as a step toward reducing gender bias in AI-based voice pathology classification.
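The two-stage flow described above can be sketched as a simple router: a first model predicts a gender label, and that label selects which disease classifier sees the input. This is a minimal illustration under stated assumptions, not the authors' implementation. The class name `TwoStageClassifier`, the labels, and the toy stand-in models are all hypothetical; in the paper the models are ResNet-50 networks on Mel spectrograms, and the exact form of the gender conditioning may differ from the hard routing used here.

```python
from typing import Callable, Dict, List, Tuple

Spectrogram = List[List[float]]  # rows = Mel bins, columns = time frames


class TwoStageClassifier:
    """Hypothetical router: stage 1 picks a gender label, stage 2 classifies."""

    def __init__(
        self,
        gender_model: Callable[[Spectrogram], str],
        disease_models: Dict[str, Callable[[Spectrogram], str]],
    ) -> None:
        self.gender_model = gender_model
        self.disease_models = disease_models

    def predict(self, spec: Spectrogram) -> Tuple[str, str]:
        # Stage 1: predict the gender label from the spectrogram.
        gender = self.gender_model(spec)
        if gender not in self.disease_models:
            raise KeyError(f"no disease model registered for label {gender!r}")
        # Stage 2: run the classifier conditioned on that label.
        return gender, self.disease_models[gender](spec)


def toy_gender_model(spec: Spectrogram) -> str:
    """Placeholder for a trained model: thresholds the mean energy."""
    total = sum(sum(row) for row in spec)
    count = sum(len(row) for row in spec)
    return "female" if total / count > 0.5 else "male"


if __name__ == "__main__":
    clf = TwoStageClassifier(
        toy_gender_model,
        {
            "female": lambda spec: "healthy",      # placeholder classifier
            "male": lambda spec: "pathological",   # placeholder classifier
        },
    )
    print(clf.predict([[0.9, 0.8]]))
```

One consequence of hard routing is that a stage-1 error sends the sample to the wrong stage-2 model, so the accuracy of the first stage bounds what the second can achieve.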

📝 Abstract

AI-based voice analysis shows promise for disease diagnostics, but existing classifiers often fail to accurately identify specific pathologies because of gender-related acoustic variations and the scarcity of data for rare diseases. We propose a novel two-stage framework that first identifies gender-specific pathological patterns using ResNet-50 on Mel spectrograms, then performs gender-conditioned disease classification. We address class imbalance through multi-scale resampling and time-warping augmentation. Evaluated on a merged dataset from four public repositories, our two-stage architecture with time warping achieves state-of-the-art performance (97.63% accuracy, 95.25% MCC), with a 5% MCC improvement over the single-stage baseline. This work advances voice pathology classification while reducing gender bias through hierarchical modeling of vocal characteristics.
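Because the abstract reports MCC alongside accuracy, a short sketch of how the multiclass Matthews Correlation Coefficient can be computed may help. It uses the standard multiclass generalization of MCC; the function name `multiclass_mcc` is hypothetical, and the reported 95.25% is presumably this coefficient multiplied by 100, which is an assumption rather than something the source states.

```python
from collections import Counter
from math import sqrt
from typing import Hashable, Sequence


def multiclass_mcc(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Multiclass MCC in [-1, 1]; returns 0.0 when the denominator is zero."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                      # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))      # correct predictions
    t_counts = Counter(y_true)                           # true count per class
    p_counts = Counter(y_pred)                           # predicted count per class
    labels = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    var_true = s * s - sum(v * v for v in t_counts.values())
    var_pred = s * s - sum(v * v for v in p_counts.values())
    denominator = sqrt(var_true * var_pred)
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Nine majority-class samples and one rare-class sample.
    y_true = [0] * 9 + [1]
    always_majority = [0] * 10
    accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
    print(accuracy, multiclass_mcc(y_true, always_majority))
```

In the usage example, a classifier that always predicts the majority class scores 0.9 accuracy but 0.0 MCC, which is why MCC is a useful companion metric when rare disorders are underrepresented.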

Problem

Research questions and friction points this paper is trying to address.

Classifying voice pathologies despite gender-related acoustic variations

Addressing data scarcity for rare diseases in voice pathology detection

Reducing gender bias in AI-based voice pathology classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage gender-aware hierarchical classification model

ResNet-50 on Mel spectrograms for pattern detection

Multi-scale resampling and time warping augmentation
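To make the time-warping idea in the list above concrete, here is a minimal sketch that stretches or compresses the time axis of a Mel spectrogram by linear interpolation. The source does not specify the warping method or its parameters, so the function names `time_warp` and `random_time_warp`, the uniform rate, and the 0.8 to 1.2 range are all assumptions chosen for illustration. Multi-scale resampling is not covered here because the source gives no detail on it.

```python
import random
from typing import List, Optional

Spectrogram = List[List[float]]  # rows = Mel bins, columns = time frames


def time_warp(spec: Spectrogram, rate: float) -> Spectrogram:
    """Resample the time axis: rate < 1 stretches, rate > 1 compresses.

    Assumes a non-empty rectangular spectrogram.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    n_frames = len(spec[0])
    new_len = max(1, round(n_frames / rate))
    warped: Spectrogram = []
    for row in spec:
        new_row = []
        for j in range(new_len):
            # Position of output frame j on the original time axis.
            pos = j * (n_frames - 1) / (new_len - 1) if new_len > 1 else 0.0
            lo = int(pos)
            hi = min(lo + 1, n_frames - 1)
            frac = pos - lo
            # Linear interpolation between the two neighbouring frames.
            new_row.append(row[lo] * (1 - frac) + row[hi] * frac)
        warped.append(new_row)
    return warped


def random_time_warp(
    spec: Spectrogram,
    low: float = 0.8,    # hypothetical lower bound on the warp rate
    high: float = 1.2,   # hypothetical upper bound on the warp rate
    rng: Optional[random.Random] = None,
) -> Spectrogram:
    """Apply time_warp with a rate drawn uniformly from [low, high]."""
    rng = rng or random.Random()
    return time_warp(spec, rng.uniform(low, high))


if __name__ == "__main__":
    spec = [[0.0, 1.0, 2.0, 3.0]]
    print(time_warp(spec, 2.0))        # compress to half as many frames
    print(len(time_warp(spec, 0.5)[0]))  # stretch to twice as many frames
```

Applying several random warps to each rare-class recording is one way such augmentation could be used to enlarge underrepresented classes; whether the authors warp all classes or only rare ones is not stated in the source.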
