🤖 AI Summary
This work addresses the challenge of enabling robots in smart cities co-inhabited by humans and robots, such as NEOM, to achieve low-latency, energy-efficient real-time responses across multimodal perception, cognition, and interaction tasks, which conventional homogeneous computing architectures struggle to support. The authors propose a heterogeneous computing platform that integrates neuromorphic hardware (Loihi 2 processors and event-based cameras) with a GPU cluster: the former handles real-time perception and interaction, while the latter manages high-level language understanding and task planning. Through hardware-software co-design, the system achieves efficient end-to-end integration. The architecture deeply couples neuromorphic sensing with conventional AI computation, demonstrating low-latency responsiveness in a humanoid robot-human musical co-performance task and pointing toward a new paradigm for human-robot collaboration in dynamic environments.
📝 Abstract
Industry 4.0 has embraced tight integration between machinery (OT), software (IT), and the Internet, creating a web of sensors, data, and algorithms in service of efficient and reliable production. A new concept, Society 5.0, is now emerging, in which the infrastructure of a city is instrumented to increase reliability, efficiency, and safety. Robotics will play a pivotal role in enabling this vision, pioneered by the NEOM initiative, a smart city co-inhabited by humans and robots. In this paper we explore the computing platform required to enable this vision. We show how neuromorphic computing hardware, exemplified by the Loihi 2 processor used in conjunction with event-based cameras for sensing, real-time perception, and interaction, can be combined with a local AI compute cluster (GPUs) for high-level language processing, cognition, and task planning. We demonstrate this hybrid computing architecture in an interactive task in which a humanoid robot plays a musical instrument with a human. Central to our design is the efficient and seamless integration of disparate components, ensuring that the synergy between software and hardware maximizes overall performance and responsiveness. Our proposed system architecture underscores the potential of heterogeneous computing in advancing robotic autonomy and interactive intelligence, pointing toward a future where such integrated systems become the norm in complex, real-time applications.