🤖 AI Summary
Real-time, fine-grained facial emotion recognition (FER) remains challenging due to the need for joint classification of discrete emotion categories and continuous intensity estimation across heterogeneous input modalities (static images and video streams).
Method: We propose an end-to-end FER system supporting both static and real-time video inputs, trained to recognize seven basic emotions (anger, disgust, fear, happiness, neutral, sadness, surprise) via simultaneous coarse-grained classification and continuous intensity regression. Our approach introduces a custom full-pipeline framework—including controlled data acquisition, grayscale normalization, tailored data augmentation, and a lightweight CNN architecture—optimized jointly on a proprietary dataset and two public benchmarks (CK+ and RAF-DB).
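The core idea of joint coarse-grained classification and continuous intensity regression can be sketched as two heads sharing one CNN feature vector: a softmax head over the seven emotion classes and a sigmoid head for intensity. The sketch below is a minimal NumPy illustration under assumed shapes (a 128-dimensional feature vector, random weights); it is not the paper's actual architecture.

```python
import numpy as np

# The seven basic emotions named in the paper.
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def joint_heads(features, w_cls, w_reg):
    """Map a shared CNN feature vector to (class probabilities, intensity).

    features: (d,) shared feature vector from the CNN backbone (assumed)
    w_cls:    (d, 7) classification head weights (assumed)
    w_reg:    (d,) regression head weights (assumed)
    """
    probs = softmax(features @ w_cls)                       # discrete emotion distribution
    intensity = 1.0 / (1.0 + np.exp(-(features @ w_reg)))   # sigmoid squashes to [0, 1]
    return probs, intensity

# Illustrative forward pass with random features and weights.
rng = np.random.default_rng(0)
feats = rng.standard_normal(128)
probs, intensity = joint_heads(feats, rng.standard_normal((128, 7)), rng.standard_normal(128))
```

In a trained model the two heads would be optimized jointly, e.g. cross-entropy on the class probabilities plus a regression loss on the intensity output.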
Contribution/Results: To our knowledge, this is the first work achieving concurrent high-accuracy classification (mean accuracy >80%) and interpretable intensity regression within a single model. With <50 ms inference latency per frame, the system satisfies real-time video processing requirements, demonstrating the feasibility and robustness of CNNs for industrial-grade emotion sensing.
📝 Abstract
Emotion plays an important role in daily life, helping people communicate with and understand each other more efficiently. Facial expressions can be classified into seven categories: angry, disgust, fear, happy, neutral, sad, and surprise, and detecting and recognizing these seven emotions has become a popular research topic over the past decade. In this paper, we develop a deep-learning-based emotion recognition system that works on both still images and real-time video. We build our emotion classification and regression system from scratch, covering dataset collection, data preprocessing, model training, and testing. Given an image or a real-time video stream, our system outputs classification and regression results for all seven emotions. The proposed system is tested on two different datasets and achieves an accuracy of over 80%. Moreover, the results obtained from real-time testing demonstrate the feasibility of applying convolutional neural networks in real time to detect emotions accurately and efficiently.
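The preprocessing stage described above (grayscale normalization before the CNN) might look like the following minimal NumPy sketch. The 48x48 target size, the BT.601 luminance weights, and the nearest-neighbour resize are assumptions for illustration, not details taken from the paper; a real pipeline would typically use OpenCV for conversion and resizing.

```python
import numpy as np

def preprocess(frame_rgb, size=48):
    """Convert an RGB frame to a normalized grayscale CNN input (sketch).

    frame_rgb: (H, W, 3) uint8 image or video frame
    size:      assumed square input size for the lightweight CNN
    """
    # Luminance grayscale conversion (ITU-R BT.601 weights, an assumption here).
    gray = frame_rgb[..., :3] @ np.array([0.299, 0.587, 0.114])
    # Nearest-neighbour resize to size x size by index sampling.
    rows = np.linspace(0, gray.shape[0] - 1, size).astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, size).astype(int)
    small = gray[np.ix_(rows, cols)]
    # Scale pixel values to [0, 1] before feeding the network.
    return (small / 255.0).astype(np.float32)
```

The same function would serve both input modes: applied once per still image, or once per captured frame in the real-time video loop.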