Developing Neural Network-Based Gaze Control Systems for Social Robots

📅 2026-02-11
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses the challenge of generating natural gaze behavior for social robots in multi-person group interactions, where contextual appropriateness is critical for understanding intent and sustaining engagement. To this end, the work presents the first integration of 2D screen-based and 3D virtual reality environments for collecting multimodal gaze data, using eye trackers and an Oculus Quest 1 headset. End-to-end deep neural networks based on LSTM and Transformer architectures are developed to model the temporal dynamics of human gaze in complex social scenarios. The models achieve 60% accuracy in predicting gaze direction in a 2D animated setting and 65% in a 3D setting, and the best model is deployed on a Nao robot. User evaluations indicate that the system enhances the perceived naturalness and credibility of the robot's gaze behavior, with particularly favorable ratings from participants experienced with robotic systems.

πŸ“ Abstract
During multi-party interactions, gaze direction is a key indicator of interest and intent, making it essential for social robots to direct their attention appropriately. Understanding the social context is crucial for robots to engage effectively, predict human intentions, and navigate interactions smoothly. This study aims to develop an empirical motion-time pattern for human gaze behavior in various social situations (e.g., entering, leaving, waving, talking, and pointing) using deep neural networks trained on participants' data. We created two video clips, one for a computer screen and another for a virtual reality headset, depicting different social scenarios. Data were collected from 30 participants: 15 using an eye tracker and 15 using an Oculus Quest 1 headset. Deep learning models, specifically Long Short-Term Memory (LSTM) and Transformers, were used to analyze and predict gaze patterns. Our models achieved 60% accuracy in predicting gaze direction in a 2D animation and 65% accuracy in a 3D animation. The best model was then implemented on the Nao robot, and 36 new participants evaluated its performance. The feedback indicated overall satisfaction, with those experienced in robotics rating the models more favorably.
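The abstract frames gaze prediction as a supervised sequence task: a window of past gaze targets in, the next gaze direction out. A minimal sketch of that framing is shown below; the target labels, window length, and helper names are illustrative assumptions, not the paper's actual pipeline or label set.

```python
from typing import List, Tuple

# Illustrative gaze-direction labels; the paper's actual label set is an assumption here.
GAZE_TARGETS = ["person_left", "person_right", "object", "away"]

def make_windows(seq: List[str], window: int) -> List[Tuple[Tuple[str, ...], str]]:
    """Slide a fixed-length window over a recorded gaze sequence:
    each window of past targets is an input, the following target its label."""
    return [(tuple(seq[i:i + window]), seq[i + window])
            for i in range(len(seq) - window)]

def majority_baseline(train: List[Tuple[Tuple[str, ...], str]]) -> str:
    """Predict the most frequent next-target label: a sanity-check floor
    that any learned sequence model should beat."""
    counts: dict = {}
    for _, label in train:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

def accuracy(preds: List[str], labels: List[str]) -> float:
    """Fraction of predictions matching the true next gaze target."""
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)
```

An LSTM or Transformer classifier would consume the same windowed samples; a constant-prediction baseline like this merely gives a floor against which accuracies such as the reported 60% (2D) and 65% (3D) can be judged.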
Problem

Research questions and friction points this paper is trying to address.

gaze control
social robots
social interaction
human gaze behavior
attention direction
Innovation

Methods, ideas, or system contributions that make the work stand out.

gaze prediction
social robots
deep neural networks
LSTM
Transformers
🔎 Similar Papers
No similar papers found.