GraphEnet: Event-driven Human Pose Estimation with a Graph Neural Network

📅 2025-10-09
🤖 AI Summary
Event-camera data exhibit sparsity, asynchrony, and high temporal resolution—posing challenges for conventional frame-based 2D human pose estimation. To address this, we propose the first graph neural network framework tailored for event-driven single-person 2D pose estimation. Our method models human joints as graph nodes and constructs a spatiotemporal event graph from event streams represented via line-integral encoding. We introduce a confidence-weighted pooling mechanism and an offset vector learning paradigm to effectively integrate local event density and motion direction information. Compared to frame-based approaches, our method significantly reduces computational overhead and latency, enabling real-time pose estimation at >100 Hz with strong robustness in dynamic scenes. Extensive experiments validate its efficiency on resource-constrained platforms—including mobile devices and robots—demonstrating practical viability for low-power perception tasks. The source code is publicly released to foster adoption of event-based vision in energy-efficient applications.
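To make the sparsity claim concrete, here is a minimal sketch of accumulating asynchronous events of the form (t, x, y, polarity) into a per-pixel count image. This is a generic event-camera illustration, not the paper's line-integral encoding; the function name and array layout are assumptions for demonstration only.

```python
import numpy as np

def events_to_count_image(events, height, width):
    """Accumulate asynchronous events into a sparse per-pixel count image.

    `events` is an (N, 4) array of (t, x, y, polarity) rows -- a common
    generic event-camera format. This is NOT GraphEnet's line-integral
    representation; it only illustrates how sparse event data are.
    """
    img = np.zeros((height, width), dtype=np.int32)
    xs = events[:, 1].astype(int)
    ys = events[:, 2].astype(int)
    np.add.at(img, (ys, xs), 1)  # unbuffered add: repeated pixels accumulate
    return img

# five events on a 4x4 sensor, two landing on the same pixel
ev = np.array([[0.01, 1, 2,  1],
               [0.02, 1, 2, -1],
               [0.03, 3, 0,  1],
               [0.04, 0, 3,  1],
               [0.05, 2, 2, -1]])
img = events_to_count_image(ev, 4, 4)
print(img[2, 1])        # 2 events fell on pixel (x=1, y=2)
print((img > 0).sum())  # only 4 of 16 pixels are active
```

Only a handful of pixels carry any signal, which is exactly the structure a graph neural network can exploit by operating on active locations rather than dense frames.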

📝 Abstract
Human pose estimation is a crucial module in human-machine interaction applications and, especially since the rise of deep learning, robust methods are available to consumers using RGB cameras and commercial GPUs. Event-based cameras, on the other hand, have gained popularity in the vision research community for their low-latency and low-energy advantages, which make them ideal for applications where those resources are constrained, such as portable electronics and mobile robots. In this work we propose a Graph Neural Network, GraphEnet, that leverages the sparse nature of event camera output, with an intermediate line-based event representation, to estimate the 2D human pose of a single person at high frequency. The architecture incorporates a novel offset vector learning paradigm with confidence-based pooling to estimate the human pose. This is the first work that applies Graph Neural Networks to event data for human pose estimation. The code is open-source at https://github.com/event-driven-robotics/GraphEnet-NeVi-ICCV2025.
Problem

Research questions and friction points this paper is trying to address.

Estimating 2D human pose using event-based cameras
Leveraging sparse event data with graph neural networks
Addressing low-latency pose estimation for resource-constrained applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Neural Network for event-based pose estimation
Offset vector learning with confidence pooling
Line-based event representation for sparse data
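The offset-vector-with-confidence idea above can be sketched as follows: each graph node votes for a joint location via a learned 2D offset from its own position, and votes are pooled with confidence weights. This is an illustrative sketch only; the pooling rule (softmax-weighted mean) and all names are assumptions, not GraphEnet's actual implementation.

```python
import numpy as np

def confidence_pooled_joint(node_pos, offsets, confidences):
    """Estimate one joint location by confidence-weighted pooling.

    node_pos:    (N, 2) positions of graph nodes
    offsets:     (N, 2) learned offset vectors, node -> joint
    confidences: (N,)   per-node confidence logits

    Each node proposes joint = position + offset; proposals are averaged
    with softmax-normalized confidence weights. (Sketch only -- the exact
    pooling used in the paper may differ.)
    """
    votes = node_pos + offsets                 # each node's joint proposal
    w = np.exp(confidences - confidences.max())
    w = w / w.sum()                            # softmax over confidences
    return (w[:, None] * votes).sum(axis=0)    # weighted 2D average

# three nodes, all of whose offset votes agree on the joint at (12, 21)
nodes = np.array([[10.0, 20.0], [12.0, 18.0], [50.0, 50.0]])
offs  = np.array([[ 2.0,  1.0], [ 0.0,  3.0], [-38.0, -29.0]])
conf  = np.array([2.0, 2.0, 2.0])
print(confidence_pooled_joint(nodes, offs, conf))  # [12. 21.]
```

The appeal of this formulation is that distant nodes can still contribute accurate votes through large offsets, while the confidence weights let the network discount nodes whose local event density carries little information about the joint.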
Gaurvi Goyal
Maastricht University & Istituto Italiano di Tecnologia
Pham Cong Thuong
Istituto Italiano di Tecnologia
Arren Glover
Istituto Italiano di Tecnologia
Event-based vision · Visual Tracking · Learning
Masayoshi Mizuno
Sony Interactive Entertainment Inc.
Chiara Bartolozzi
Researcher, Fondazione Istituto Italiano di Tecnologia
Neuromorphic engineering