AI Summary
This study addresses the challenge of recognizing subtle emotional responses, particularly to auditory stimuli such as name-calling, in children with Autism Spectrum Disorder (ASD) during human-robot interaction (HRI) with the NAO robot.
Method: We propose an end-to-end vision-geometry joint modeling framework: (i) a novel architecture integrating ResNet-50 and a three-layer Graph Convolutional Network (GCN), leveraging MediaPipe FaceMesh facial landmarks and KL-divergence-driven embedding fusion; and (ii) dual-model weighted soft labeling using DeepFace and FER for seven-class probabilistic emotion annotation.
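The dual-model soft labeling described above can be sketched as a weighted average of two per-frame emotion distributions. This is a minimal illustration, not the paper's implementation: the probability vectors are supplied directly here (in the actual pipeline they would come from DeepFace and FER inference), and the equal weights are an assumption, not values reported by the study.

```python
import numpy as np

# Standard seven-class emotion taxonomy used by both DeepFace and FER
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def weighted_soft_label(p_deepface, p_fer, w_deepface=0.5, w_fer=0.5):
    """Fuse two length-7 emotion probability vectors into one soft label.

    p_deepface, p_fer: per-frame class distributions (hypothetical inputs
    standing in for the two models' outputs). The weights are illustrative;
    the paper does not report its exact weighting.
    """
    fused = w_deepface * np.asarray(p_deepface) + w_fer * np.asarray(p_fer)
    return fused / fused.sum()  # renormalize to a valid distribution

# Example: two models that both lean toward "happy" for a frame
soft = weighted_soft_label(
    [0.10, 0.00, 0.00, 0.70, 0.10, 0.00, 0.10],
    [0.20, 0.00, 0.00, 0.50, 0.20, 0.00, 0.10],
)
```

The soft label preserves each model's uncertainty rather than collapsing to a single hard class, which is what makes KL-divergence training against it meaningful.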
Contribution/Results: We introduce the first large-scale, real-world facial dataset of ASD children interacting with robots in India (50,000 frames, 15 participants), filling a critical gap in neurodiverse HRI affective data. Our method achieves state-of-the-art performance on fine-grained seven-class micro-expression classification, significantly improving robustness and interpretability, and enabling clinically deployable, human-in-the-loop therapeutic interventions.
Abstract
Understanding emotional responses in children with Autism Spectrum Disorder (ASD) during social interaction remains a critical challenge in both developmental psychology and human-robot interaction. This study presents a novel deep learning pipeline for emotion recognition in autistic children in response to a name-calling event by a humanoid robot (NAO), under controlled experimental settings. The dataset comprises approximately 50,000 facial frames extracted from video recordings of 15 children with ASD. A hybrid model combining a fine-tuned ResNet-50-based Convolutional Neural Network (CNN) and a three-layer Graph Convolutional Network (GCN) was trained on both visual and geometric features extracted from MediaPipe FaceMesh landmarks. Emotions were probabilistically labeled using a weighted ensemble of two models, DeepFace and FER, each contributing to soft-label generation across seven emotion classes. Final classification leveraged a fused embedding optimized via Kullback-Leibler divergence. The proposed method demonstrates robust performance in modeling subtle affective responses and shows significant promise for affective profiling of children with ASD in clinical and therapeutic human-robot interaction contexts. By effectively capturing micro-expression cues in neurodivergent children, the pipeline addresses a major gap in autism-specific HRI research. This work represents the first such large-scale, real-world dataset and pipeline from India on autism-focused emotion analysis using social robotics, contributing an essential foundation for future personalized assistive technologies.
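The KL-divergence objective mentioned in the abstract measures how far the fused embedding's predicted class distribution is from the ensemble soft label. A minimal sketch of that measure follows; the function name and the epsilon smoothing are illustrative assumptions, not details from the paper, and a real training loop would use a framework loss (e.g. a KL-divergence loss in PyTorch) over batches.

```python
import numpy as np

def kl_divergence(soft_label, prediction, eps=1e-9):
    """KL(soft_label || prediction) over the seven emotion classes.

    Minimizing this drives the model's predicted distribution toward
    the DeepFace/FER ensemble soft label. eps guards against log(0)
    for classes assigned zero probability (an illustrative choice).
    """
    p = np.asarray(soft_label, dtype=float) + eps
    q = np.asarray(prediction, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()  # keep both valid distributions
    return float(np.sum(p * np.log(p / q)))
```

The divergence is zero when prediction and soft label agree exactly, and grows as the predicted distribution drifts from the label, so it serves directly as a training signal for the fused CNN+GCN head.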