Parameterised Quantum Circuits for Novel Representation Learning in Speech Emotion Recognition

📅 2025-01-21
🤖 AI Summary
Speech emotion recognition (SER) struggles to distinguish subtle emotional differences and to resolve high inter-class overlap. To address this, the paper proposes a hybrid quantum-classical model that integrates convolutional neural networks (CNNs) with parameterised quantum circuits (PQCs), using quantum superposition and entanglement to learn more discriminative representations from speech features. It reports the first empirical evidence that PQCs not only reduce the model parameter count by over 30% but also significantly improve SER accuracy, and it is the first to apply quantum representation learning to multi-emotion-state modelling. The model is evaluated on three benchmark datasets (IEMOCAP, RECOLA, and MSP-Improv) and consistently outperforms purely classical baselines in both binary and multi-class SER tasks. These results support the effectiveness and generalisability of quantum enhancement for SER.

📝 Abstract
Speech Emotion Recognition (SER) is a complex and challenging task in human-computer interaction due to the intricate dependencies of features and the overlapping nature of emotional expressions conveyed through speech. Although traditional deep learning methods have shown effectiveness, they often struggle to capture subtle emotional variations and overlapping states. This paper introduces a hybrid classical-quantum framework that integrates Parameterised Quantum Circuits (PQCs) with conventional Convolutional Neural Network (CNN) architectures. By leveraging quantum properties such as superposition and entanglement, the proposed model enhances feature representation and captures complex dependencies more effectively than classical methods. Experimental evaluations conducted on benchmark datasets, including IEMOCAP, RECOLA, and MSP-Improv, demonstrate that the hybrid model achieves higher accuracy in both binary and multi-class emotion classification while significantly reducing the number of trainable parameters. While a few existing studies have explored the feasibility of using quantum circuits to reduce model complexity, none have successfully shown how they can enhance accuracy. This study is the first to demonstrate that quantum circuits have the potential to improve the accuracy of SER. The findings highlight the potential of quantum machine learning (QML) to transform SER and suggest a promising direction for future research and practical applications in emotion-aware systems.
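To make the idea of a PQC layer sitting behind a CNN more concrete, here is a minimal sketch in plain NumPy. It is not the circuit used in the paper, which this page does not specify. The sketch assumes four qubits, RY angle encoding of four CNN output features, a ring of CNOT gates for entanglement, one trainable RY rotation per qubit per layer, and Pauli-Z expectation values as the output features. The function names (`ry`, `apply_1q`, `apply_cnot`, `expval_z`, `pqc`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def ry(theta):
    """2x2 matrix of a single-qubit RY rotation."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit of an n-qubit state vector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)  # put the acted-on axis back in place
    return psi.reshape(-1)

def apply_cnot(state, control, target, n):
    """Flip the target qubit on the part of the state where control is 1."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    sub = psi[tuple(idx)]
    # Indexing removed the control axis, so later axes shift down by one.
    t = target if target < control else target - 1
    psi[tuple(idx)] = np.flip(sub, axis=t).copy()
    return psi.reshape(-1)

def expval_z(state, qubit, n):
    """Expectation value of Pauli-Z on one qubit."""
    probs = np.abs(state.reshape([2] * n)) ** 2
    other_axes = tuple(a for a in range(n) if a != qubit)
    p = probs.sum(axis=other_axes)
    return p[0] - p[1]

def pqc(features, weights):
    """Assumed PQC layer: angle encoding, then entangle-and-rotate layers.

    features: shape (n,), rotation angles derived from CNN outputs.
    weights:  shape (layers, n), trainable rotation angles.
    Returns n values in [-1, 1], one Z expectation per qubit.
    """
    n = len(features)
    state = np.zeros(2 ** n)
    state[0] = 1.0  # start in |00...0>
    for q in range(n):
        state = apply_1q(state, ry(features[q]), q, n)
    for layer in weights:
        for q in range(n):  # ring of CNOTs creates entanglement
            state = apply_cnot(state, q, (q + 1) % n, n)
        for q in range(n):  # trainable rotations
            state = apply_1q(state, ry(layer[q]), q, n)
    return np.array([expval_z(state, q, n) for q in range(n)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cnn_features = rng.normal(size=4)        # stand-in for CNN outputs
    angles = np.pi * np.tanh(cnn_features)   # squash into (-pi, pi)
    weights = rng.uniform(-np.pi, np.pi, size=(2, 4))
    print(pqc(angles, weights))
```

In this sketch the layer has only `layers * n` trainable angles (8 here), which illustrates how a small circuit can stand in for a larger dense layer; the parameter savings reported in the abstract refer to the authors' own architecture, not to this example.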
Problem

Research questions and friction points this paper is trying to address.

Speech Emotion Recognition
Complex Emotions
Accuracy Improvement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantum Circuits
Convolutional Neural Networks
Speech Emotion Recognition
Thejan Rajapakshe
University of Southern Queensland, Australia
R. Rana
University of Southern Queensland, Australia
Sara Khalifa
Associate Professor, Queensland University of Technology (QUT)
Smart Wearables, Internet of Things, Energy Harvesting
Bjorn W. Schuller
Technical University of Munich, Germany, and GLAM, Imperial College London, UK